FIELD OF THE INVENTION

The invention relates to polynucleotides and the polypeptides encoded by such 
polynucleotides, as well as vectors, host cells, antibodies and recombinant methods for producing 
the polypeptides and polynucleotides, as well as methods for using the same. 

BACKGROUND OF THE INVENTION 

The present invention is based in part on nucleic acids encoding proteins that are new 
members of the following protein families: delta serrate ligand receptors, protein kinases, G- 
protein coupled receptors (GPCR), ankyrin repeat containing proteins, TNF intracellular domain 
interacting proteins, secretory proteins and dual specificity phosphatases. More particularly, the 
invention relates to nucleic acids encoding novel polypeptides, as well as vectors, host cells, 
antibodies, and recombinant methods for producing these nucleic acids and polypeptides. 

SUMMARY OF THE INVENTION 

The invention is based in part upon the discovery of nucleic acid sequences encoding
novel polypeptides. The novel nucleic acids and polypeptides are referred to herein as NOVX, or
NOV1, NOV2, NOV3, NOV4, NOV5, NOV6, NOV7, NOV8, and NOV9 nucleic acids and
polypeptides. These nucleic acids and polypeptides, as well as derivatives, homologs, analogs
and fragments thereof, will hereinafter be collectively designated as "NOVX" nucleic acid or
polypeptide sequences.

In one aspect, the invention provides an isolated NOVX nucleic acid molecule encoding a
NOVX polypeptide that includes a nucleic acid sequence that has identity to the
nucleic acids disclosed in SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27 and 29.
Protein phosphorylation is a fundamental process for the regulation of cellular functions. The
coordinated action of both protein kinases and phosphatases controls the levels of phosphorylation
and, hence, the activity of specific target proteins. One of the predominant roles of protein
phosphorylation is in signal transduction, where extracellular signals are amplified and
propagated by a cascade of protein phosphorylation and dephosphorylation events. Eukaryotic
protein kinases are enzymes that belong to a very extensive family of proteins which share a
conserved catalytic core common to both serine/threonine and tyrosine protein kinases. There
are a number of conserved regions in the catalytic domain of protein kinases. In the N-terminal
extremity of the catalytic domain there is a glycine-rich stretch of residues in the vicinity of a
lysine residue, which has been shown to be involved in ATP binding. In the central part of the
catalytic domain there is a conserved aspartic acid residue which is important for the catalytic
activity of the enzyme. In some embodiments, the NOVX nucleic acid molecule will hybridize
under stringent conditions to a nucleic acid sequence complementary to a nucleic acid molecule
that includes a protein-coding sequence of a NOVX nucleic acid sequence. The invention also
includes an isolated nucleic acid that encodes a NOVX polypeptide, or a fragment, homolog,
analog or derivative thereof. For example, the nucleic acid can encode a polypeptide at least 80%
identical to a polypeptide comprising the amino acid sequences of SEQ ID NOS:2, 4, 6, 8, 10, 12,
14, 16, 18, 20, 22, 24, 26, 28 and 30. The nucleic acid can be, for example, a genomic DNA
fragment or a cDNA molecule that includes the nucleic acid sequence of any of SEQ ID NOS:1,
3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27 and 29.

Also included in the invention is an oligonucleotide, e.g., an oligonucleotide which
includes at least 6 contiguous nucleotides of a NOVX nucleic acid (e.g., SEQ ID NOS:1, 3, 5, 7,
9, 11, 13, 15, 17, 19, 21, 23, 25, 27 and 29) or a complement of said oligonucleotide. Also
included in the invention are substantially purified NOVX polypeptides (SEQ ID NOS:2, 4, 6, 8,
10, 12, 14, 16, 18, 20, 22, 24, 26, 28 and 30). In certain embodiments, the NOVX polypeptides
include an amino acid sequence that is substantially identical to the amino acid sequence of a
human NOVX polypeptide.
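
The oligonucleotide criterion above (at least 6 contiguous nucleotides of a NOVX nucleic acid, or a complement) can be sketched in Python. This is a minimal illustration and not a method given in the specification: the function names reverse_complement and shares_contiguous_run are hypothetical, "complement" is assumed here to mean the reverse complement, and the reference fragment is simply the first 20 nucleotides of SEQ ID NO:1 as listed in Table 1A.

```python
# Hypothetical helper names; only the 6-nucleotide minimum comes from the text.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def shares_contiguous_run(oligo: str, reference: str, min_len: int = 6) -> bool:
    """True if some min_len-long window of `oligo` occurs in `reference`
    or in the reverse complement of `reference`."""
    oligo, reference = oligo.upper(), reference.upper()
    targets = (reference, reverse_complement(reference))
    for start in range(len(oligo) - min_len + 1):
        window = oligo[start:start + min_len]
        if any(window in target for target in targets):
            return True
    return False


# First 20 nucleotides of SEQ ID NO:1 (Table 1A), used only as a small example.
NOV1A_FRAGMENT = "ATGTCACCGCCTCTGTGTCC"

print(shares_contiguous_run("CACCGC", NOV1A_FRAGMENT))  # window on the given strand
print(shares_contiguous_run("TGACAT", NOV1A_FRAGMENT))  # reverse complement of ATGTCA
print(shares_contiguous_run("ATG", NOV1A_FRAGMENT))     # shorter than 6 nucleotides
```

An oligonucleotide shorter than the minimum length never qualifies in this sketch, because no window of the required size can be taken from it.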

The invention also features antibodies that immunoselectively bind to NOVX
polypeptides, or fragments, homologs, analogs or derivatives thereof.

In another aspect, the invention includes pharmaceutical compositions that include
therapeutically- or prophylactically-effective amounts of a therapeutic and a
pharmaceutically-acceptable carrier. The therapeutic can be, e.g., a NOVX nucleic acid, a NOVX
polypeptide, or an antibody specific for a NOVX polypeptide. In a further aspect, the invention
includes, in one or more containers, a therapeutically- or prophylactically-effective amount of
this pharmaceutical composition.

In a further aspect, the invention includes a method of producing a polypeptide by
culturing a cell that includes a NOVX nucleic acid, under conditions allowing for expression of
the NOVX polypeptide encoded by the DNA. If desired, the NOVX polypeptide can then be
recovered.

In another aspect, the invention includes a method of detecting the presence of a NOVX
polypeptide in a sample. In the method, a sample is contacted with a compound that selectively
binds to the polypeptide under conditions allowing for formation of a complex between the
polypeptide and the compound. The complex is detected, if present, thereby identifying the
NOVX polypeptide within the sample.

The invention also includes methods to identify specific cell or tissue types based on their 
expression of a NOVX. 

Also included in the invention is a method of detecting the presence of a NOVX nucleic
acid molecule in a sample by contacting the sample with a NOVX nucleic acid probe or primer,
and detecting whether the nucleic acid probe or primer bound to a NOVX nucleic acid molecule
in the sample.

In a further aspect, the invention provides a method for modulating the activity of a
NOVX polypeptide by contacting a cell sample that includes the NOVX polypeptide with a
compound that binds to the NOVX polypeptide in an amount sufficient to modulate the activity of
said polypeptide. The compound can be, e.g., a small molecule, such as a nucleic acid, peptide,
polypeptide, peptidomimetic, carbohydrate, lipid or other organic (carbon containing) or inorganic
molecule, as further described herein.

Also within the scope of the invention is the use of a therapeutic in the manufacture of a
medicament for treating or preventing disorders or syndromes including, e.g., trauma,
regeneration (in vitro and in vivo), viral/bacterial/parasitic infections, Von Hippel-Lindau (VHL)
syndrome, Alzheimer's disease, stroke, Tuberous sclerosis, hypercalcemia, Parkinson's disease,
Huntington's disease, Cerebral palsy, Epilepsy, Lesch-Nyhan syndrome, multiple sclerosis,
Ataxia-telangiectasia, leukodystrophies, behavioral disorders, addiction, anxiety, pain, actinic
keratosis, acne, hair growth diseases, alopecia, pigmentation disorders, endocrine disorders,
connective tissue disorders, such as severe neonatal Marfan syndrome, dominant ectopia lentis,
familial ascending aortic aneurysm, isolated skeletal features of Marfan syndrome,
Shprintzen-Goldberg syndrome, genodermatoses, contractural arachnodactyly, inflammatory disorders such
as osteo- and rheumatoid-arthritis, inflammatory bowel disease, Crohn's disease; immunological
disorders, AIDS; cancers including but not limited to lung cancer, colon cancer, neoplasm;
adenocarcinoma; lymphoma; prostate cancer; uterus cancer, leukemia or pancreatic cancer; blood
disorders; asthma; psoriasis; vascular disorders, hypertension, skin disorders, renal disorders
including Alport syndrome, immunological disorders, tissue injury, fibrosis disorders, bone
diseases, Ehlers-Danlos syndrome type VI, VII, type IV, X-linked cutis laxa and Ehlers-Danlos
syndrome type V, osteogenesis imperfecta, neurologic diseases, brain and/or autoimmune
disorders like encephalomyelitis, neurodegenerative disorders, immune disorders, hematopoietic
disorders, muscle disorders, inflammation and wound repair, bacterial, fungal, protozoal and viral
infections (particularly infections caused by HIV-1 or HIV-2), pain, acute heart failure,
hypotension, hypertension, urinary retention, osteoporosis, treatment of Albright hereditary
osteodystrophy, angina pectoris, myocardial infarction, ulcers, benign prostatic hypertrophy,
arthrogryposis multiplex congenita, osteogenesis imperfecta, keratoconus, scoliosis, duodenal
atresia, esophageal atresia, intestinal malrotation, pancreatitis, obesity, systemic lupus
erythematosus, autoimmune disease, emphysema, scleroderma, allergy, ARDS, neuroprotection,
fertility, Myasthenia gravis, diabetes, obesity, growth and reproductive disorders, hemophilia,
hypercoagulation, idiopathic thrombocytopenic purpura, immunodeficiencies, graft versus host,
adrenoleukodystrophy, congenital adrenal hyperplasia, endometriosis, xerostomia, ulcers,
cirrhosis, transplantation, diverticular disease, Hirschsprung's disease, appendicitis, arthritis,
ankylosing spondylitis, tendinitis, renal artery stenosis, interstitial nephritis, glomerulonephritis,
polycystic kidney disease, erythematosus, renal tubular acidosis, IgA nephropathy, anorexia,
bulimia, psychotic disorders, including anxiety, schizophrenia, manic depression, delirium,
dementia, severe mental retardation and dyskinesias, such as Huntington's disease and/or other
pathologies and disorders of the like.

The therapeutic can be, e.g., a NOVX nucleic acid, a NOVX polypeptide, or a NOVX- 
specific antibody, or biologically-active derivatives or fragments thereof. 

For example, the compositions of the present invention will have efficacy for treatment of
patients suffering from the diseases and disorders disclosed above and/or other pathologies and
disorders of the like. The polypeptides can be used as immunogens to produce antibodies specific
for the invention, and as vaccines. They can also be used to screen for potential agonist and
antagonist compounds. For example, a cDNA encoding NOVX may be useful in gene therapy,
and NOVX may be useful when administered to a subject in need thereof. By way of
non-limiting example, the compositions of the present invention will have efficacy for treatment of
patients suffering from the diseases and disorders disclosed above and/or other pathologies and
disorders of the like.

The invention further includes a method for screening for a modulator of disorders or
syndromes including, e.g., the diseases and disorders disclosed above and/or other pathologies
and disorders of the like. The method includes contacting a test compound with a NOVX
polypeptide and determining if the test compound binds to said NOVX polypeptide. Binding of
the test compound to the NOVX polypeptide indicates the test compound is a modulator of
activity, or of latency or predisposition to the aforementioned disorders or syndromes.

Also within the scope of the invention is a method for screening for a modulator of
activity, or of latency or predisposition to disorders or syndromes including, e.g., the diseases and
disorders disclosed above and/or other pathologies and disorders of the like by administering a
test compound to a test animal at increased risk for the aforementioned disorders or syndromes.
The test animal expresses a recombinant polypeptide encoded by a NOVX nucleic acid.
Expression or activity of NOVX polypeptide is then measured in the test animal, as is expression
or activity of the protein in a control animal which recombinantly-expresses NOVX polypeptide
and is not at increased risk for the disorder or syndrome. Next, the expression of NOVX
polypeptide in both the test animal and the control animal is compared. A change in the activity
of NOVX polypeptide in the test animal relative to the control animal indicates the test compound
is a modulator of latency of the disorder or syndrome.

In yet another aspect, the invention includes a method for determining the presence of or
predisposition to a disease associated with altered levels of a NOVX polypeptide, a NOVX
nucleic acid, or both, in a subject (e.g., a human subject). The method includes measuring the
amount of the NOVX polypeptide in a test sample from the subject and comparing the amount of
the polypeptide in the test sample to the amount of the NOVX polypeptide present in a control
sample. An alteration in the level of the NOVX polypeptide in the test sample as compared to the
control sample indicates the presence of or predisposition to a disease in the subject. Preferably,
the predisposition includes, e.g., the diseases and disorders disclosed above and/or other
pathologies and disorders of the like. Also, the expression levels of the new polypeptides of the
invention can be used in a method to screen for various cancers as well as to determine the stage
of cancers.
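
The comparison of a test sample against a control sample can be illustrated with a short Python sketch. The specification does not state how large an alteration must be, so the fold-change cutoff min_fold below is a purely hypothetical parameter, as are the function names and the example measurements.

```python
def fold_change(test_level: float, control_level: float) -> float:
    """Ratio of the test-sample level to the control-sample level."""
    if control_level <= 0:
        raise ValueError("control level must be positive")
    return test_level / control_level


def is_altered(test_level: float, control_level: float, min_fold: float) -> bool:
    """True if the test level is raised or lowered by at least `min_fold`-fold.

    `min_fold` (> 1) is an assumed cutoff, not a value from the specification.
    """
    ratio = fold_change(test_level, control_level)
    return ratio >= min_fold or ratio <= 1.0 / min_fold


# Hypothetical measurements in arbitrary units.
print(is_altered(test_level=8.0, control_level=2.0, min_fold=2.0))  # 4-fold increase
print(is_altered(test_level=2.2, control_level=2.0, min_fold=2.0))  # within the cutoff
```

Treating increases and decreases symmetrically is a design choice of the sketch; the text speaks only of "an alteration" without giving a direction.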

In a further aspect, the invention includes a method of treating or preventing a pathological
condition associated with a disorder in a mammal by administering a NOVX
polypeptide, a NOVX nucleic acid, or a NOVX-specific antibody to a subject (e.g., a human
subject), in an amount sufficient to alleviate or prevent the pathological condition. In preferred
embodiments, the disorder includes, e.g., the diseases and disorders disclosed above and/or other
pathologies and disorders of the like.

In yet another aspect, the invention can be used in a method to identify the cellular
receptors and downstream effectors of the invention by any one of a number of techniques
commonly employed in the art. These include but are not limited to the two-hybrid system,
affinity purification, and co-precipitation with antibodies or other specific-interacting molecules.

NOVX nucleic acids and polypeptides are further useful in the generation of antibodies
that bind immuno-specifically to the novel NOVX substances for use in therapeutic or diagnostic
methods. These NOVX antibodies may be generated according to methods known in the art,
using prediction from hydrophobicity charts, as described in the "Anti-NOVX Antibodies"
section below. The disclosed NOVX proteins have multiple hydrophilic regions, each of which
can be used as an immunogen. These NOVX proteins can be used in assay systems for functional
analysis of various human disorders, which will help in understanding the pathology of the disease
and in the development of new drug targets for various disorders.

The NOVX nucleic acids and proteins identified here may be useful in potential
therapeutic applications implicated in (but not limited to) various pathologies and disorders as
indicated below. The potential therapeutic applications for this invention include, but are not
limited to: protein therapeutic, small molecule drug target, antibody target (therapeutic,
diagnostic, drug targeting/cytotoxic antibody), diagnostic and/or prognostic marker, gene therapy
(gene delivery/gene ablation), research tools, tissue regeneration in vivo and in vitro of all tissues
and cell types composing (but not limited to) those defined here.

Unless otherwise defined, all technical and scientific terms used herein have the same 
meaning as commonly understood by one of ordinary skill in the art to which this invention 
belongs. Although methods and materials similar or equivalent to those described herein can be 
used in the practice or testing of the present invention, suitable methods and materials are 
described below. All publications, patent applications, patents, and other references mentioned 
herein are incorporated by reference in their entirety. In the case of conflict, the present 
specification, including definitions, will control. In addition, the materials, methods, and 
examples are illustrative only and not intended to be limiting. 

Other features and advantages of the invention will be apparent from the following 
detailed description and claims. 

DETAILED DESCRIPTION OF THE INVENTION 

The present invention provides novel nucleotides and polypeptides encoded thereby. 
Included in the invention are the novel nucleic acid sequences and their encoded polypeptides. 
The sequences are collectively referred to herein as "NOVX nucleic acids" or "NOVX 
polynucleotides" and the corresponding encoded polypeptides are referred to as "NOVX 
polypeptides" or "NOVX proteins." Unless indicated otherwise, "NOVX" is meant to refer to any 
of the novel sequences disclosed herein. Table A provides a summary of the NOVX nucleic acids 
and their encoded polypeptides. 



TABLE A. Sequences and Corresponding SEQ ID Numbers

NOVX No.  Internal Acc. No.  Homology                                     Nucleic Acid SEQ ID NO.  Polypeptide SEQ ID NO.
1a        COR87920446_A      Delta serrate ligand receptor                1                        2
1b        CG57012-01         Delta serrate ligand receptor                3                        4
1c        CG57012-02         Delta serrate ligand receptor                5                        6
1d        CG57012-03         Delta serrate ligand receptor                7                        8
1e        CG57012-04         Delta serrate ligand receptor                9                        10
2         COR87940554        Protein kinase                               11                       12
3         COR100339661       GPCR                                         13                       14
4a        COR87934767        Ankyrin repeat containing protein            15                       16
4b        CG57238-01         Ankyrin repeat containing protein            17                       18
5         COR100396092       Ankyrin repeat containing protein            19                       20
6         COR87941483        TNF intracellular domain interacting protein 21                       22
7         COR101716725       Secretory protein                            23                       24
8a        CG56663-01         GPCR                                         25                       26
8b        CG56663-02         GPCR                                         27                       28
9         CG56787 01         Dual specificity phosphatase                 29                       30



NOVX nucleic acids and their encoded polypeptides are useful in a variety of applications 
and contexts. The various NOVX nucleic acids and polypeptides according to the invention are 
useful as novel members of the protein families according to the presence of domains and 
sequence relatedness to previously described proteins. Additionally, NOVX nucleic acids and 
polypeptides can also be used to identify proteins that are members of the family to which the 
NOVX polypeptides belong. 

NOV1a to NOV1e are homologous to the Delta serrate ligand receptor family of proteins.
Thus, the NOV1a to NOV1e nucleic acids, polypeptides, antibodies and related compounds
according to the invention are useful in potential diagnostic and therapeutic applications
implicated in, for example, cardiovascular disease, Alagille syndrome, neural development
defects, other developmental defects and other diseases, disorders and conditions of the like.

NOV2 is homologous to Protein kinases. Therefore, the nucleic acids and proteins of the
invention are useful in potential diagnostic and therapeutic applications implicated in, for
example, Hypercalcemia, Ulcers, Hemophilia, hypercoagulation, idiopathic thrombocytopenic
purpura, autoimmune disease, allergies, immunodeficiencies, transplantation, Graft versus host
disease (GVHD), Lymphaedema, Systemic lupus erythematosus, Autoimmune disease, Asthma,
Emphysema, Scleroderma, allergy, Diabetes, Autoimmune disease, Renal artery stenosis,
Interstitial nephritis, Glomerulonephritis, Polycystic kidney disease, Systemic lupus
erythematosus, Renal tubular acidosis, IgA nephropathy, Cardiovascular disease, Hypercalcemia,
Lesch-Nyhan syndrome, Fertility, Cancer and other diseases, disorders and conditions of the like.

NOV3, NOV8a and NOV8b are homologous to GPCRs. Thus, the NOV3, NOV8a and
NOV8b nucleic acids and polypeptides, antibodies and related compounds according to the
invention will be useful in therapeutic and diagnostic applications implicated in, for example, Von
Hippel-Lindau (VHL) syndrome, Cirrhosis, Transplantation, Hemophilia, Hypercoagulation,
Idiopathic thrombocytopenic purpura, Immunodeficiencies, Graft versus host disorders and other
diseases, disorders and conditions of the like.

NOV4a, NOV4b and NOV5 are homologous to the Ankyrin repeat containing proteins.
Thus, NOV4a, NOV4b and NOV5 nucleic acids, polypeptides, antibodies and related compounds
according to the invention will be useful in therapeutic and diagnostic applications implicated in,
for example, Endometriosis, Fertility, Von Hippel-Lindau (VHL) syndrome, Alzheimer's disease,
Stroke, Tuberous sclerosis, hypercalcemia, Parkinson's disease, Huntington's disease, Cerebral
palsy, Epilepsy, Lesch-Nyhan syndrome, Multiple sclerosis, Ataxia-telangiectasia,
Leukodystrophies, Behavioral disorders, Addiction, Anxiety, Pain, Neuroprotection, Systemic
lupus erythematosus, Autoimmune disease, Asthma, Emphysema, Scleroderma, allergy, and other
diseases, disorders and conditions of the like.

NOV6 is homologous to the TNF intracellular domain interacting proteins. Thus, NOV6
nucleic acids, polypeptides, antibodies and related compounds according to the invention will be
useful in therapeutic and diagnostic applications implicated in, for example, cardio-vascular
disorders, Cardiomyopathy, Atherosclerosis, Hypertension, Congenital heart defects, Aortic
stenosis, Atrial septal defect (ASD), Atrioventricular (A-V) canal defect, Ductus arteriosus,
Pulmonary stenosis, Subaortic stenosis, Ventricular septal defect (VSD), valve diseases,
Tuberous sclerosis, Scleroderma, Obesity, Transplantation, Systemic lupus erythematosus,
Autoimmune disease, Asthma, Emphysema, Scleroderma, allergy, Diabetes, Autoimmune
disease, Renal artery stenosis, Interstitial nephritis, Glomerulonephritis, Polycystic kidney
disease, Systemic lupus erythematosus, Renal tubular acidosis, IgA nephropathy, Hypercalcemia,
Lesch-Nyhan syndrome and other diseases, disorders and conditions of the like.

NOV7 is homologous to Secretory proteins. Thus, the NOV7 nucleic acids, polypeptides,
antibodies and related compounds according to the invention will be useful in therapeutic and
diagnostic applications implicated in, for example, cardio-vascular diseases, Cardiomyopathy,
Atherosclerosis, Hypertension, Congenital heart defects, Aortic stenosis, Atrial septal defect
(ASD), Atrioventricular (A-V) canal defect, Ductus arteriosus, Pulmonary stenosis, Subaortic
stenosis, Ventricular septal defect (VSD), valve diseases, Tuberous sclerosis, Scleroderma,
Obesity, Transplantation, Systemic lupus erythematosus, Autoimmune disease, Asthma,
Emphysema, Scleroderma, allergy, Diabetes, Autoimmune disease, Renal artery stenosis,
Interstitial nephritis, Glomerulonephritis, Polycystic kidney disease, Systemic lupus
erythematosus, Renal tubular acidosis, IgA nephropathy, Hypercalcemia, Lesch-Nyhan syndrome
and other diseases, disorders and conditions of the like.

NOV9 is homologous to Dual specificity phosphatase. Thus, the NOV9 nucleic acids, 
polypeptides, antibodies and related compounds according to the invention will be useful in 
therapeutic and diagnostic applications implicated in, for example, the treatment of patients 
suffering from: brain disorders including epilepsy, eating disorders, schizophrenia, ADD, and 
cancer; heart disease; blood disorders, kidney disorders, liver diseases, inflammation and 
autoimmune disorders including Crohn's disease, IBD, allergies, rheumatoid and osteoarthritis, 
inflammatory skin disorders, allergies, blood disorders; psoriasis; colon-, ovarian-, testicular-, 
lymphatic-, brain-, and pancreatic cancers; leukemia; AIDS; thalamus disorders; metabolic
disorders including diabetes and obesity; lung diseases such as asthma, emphysema, cystic 
fibrosis, and cancer; pancreatic disorders including pancreatic insufficiency; and prostate 
disorders including prostate cancer and other diseases, disorders and conditions of the like. 

The NOVX nucleic acids and polypeptides can also be used to screen for molecules, 
which inhibit or enhance NOVX activity or function. Specifically, the nucleic acids and 
polypeptides according to the invention may be used as targets for the identification of small 
molecules that modulate or inhibit, e.g., neurogenesis, cell differentiation, cell proliferation, 
hematopoiesis, wound healing and angiogenesis. 

Additional utilities for the NOVX nucleic acids and polypeptides according to the
invention are disclosed herein.

NOV1

One NOVX protein of the invention, referred to herein as NOV1, includes five delta
serrate ligand receptors. The disclosed proteins have been named NOV1a, NOV1b, NOV1c,
NOV1d and NOV1e.



NOV1a

A disclosed NOV1a (designated CuraGen Acc. No. COR87920446_A), which encodes a
novel delta serrate ligand receptor and includes the 3063 nucleotide sequence (SEQ ID NO:1), is
shown in Table 1A. An open reading frame for the mature protein was identified beginning with
an ATG initiation codon at nucleotides 1-3 and ending with a TGA codon at nucleotides 3061-3063.
Putative untranslated regions, if any, are found upstream from the initiation codon and
downstream from the termination codon and are underlined in Table 1A, and the start and stop
codons are in bold letters.



Table 1A. NOV1a Nucleotide Sequence (SEQ ID NO:1)



ATGTCACCGCCTCTGTGTCCCCTCCTTCTCCTGGCTGTGGGCCTGCGGCTGGCTGGAACTCTCAACC 

CCAGTGATCCCAATACCTGCAGCTTCTGGGAAAGCTTCACTACCACCACCAAGGAGTCCCACTCCC 

GCCCCTTCAGCCTGCTCCCCTCAGAGCCCTGCGAGCGGCCCTGGGAGGGCCCCCATACTTGCCCCC 

AGCCCACGGTTGTATACCGGACCGTGTACCGTCAGGTGGTGAAGACGGACCACCGCCAGCGCCTGC 

AGTGCTGCCATGGCTTCTATGAGAGCAGGGGGTTCTGTGTCCCGCTCTGTGCCCAGGAGTGTGTCC 

ATGGCCGTTGTGTGGCACCCAATCAGTGCCAATGTGTGCCAGGCTGGCGGGGCGACGACTGTTCCA 

GTGAGTGTGCCCCAGGAATGTGGGGGCCACAGTGTGACAAGCCCTGCAGCTGCGGCAACAACAGC 

TCGTGTGATCCCAAGAGTGGGGTATGTTCTTGCCCTTCTGGTCTGCAGCCCCCGAACTGCCTTCAGC 

CCTGTACCCCTGGCTACTATGGCCCTGCCTGCCAGTTCCGCTGCCAGTGCCATGGGGCACCCTGCGA 

TCCCCAGACTGGAGCCTGCTTCTGCCCCGCAGAGAGAACTGGGCCCAGCTGTGACGTGTCCTGTTC 

CCAGGGCACTTCTGGCTTCTTCTGCCCCAGCACCCATTCTTGCCAAAATGGAGGTGTCTTCCAAACC 

CCACAGGGCTCCTGCAGCTGCCCCCCTGGCTGGATGGTATGGAGGGTGGGGCCTGTGGGCATGGGG 

TGTGGGTCTGGGGAGAATTCTGTGGGTGGTGCTAAGCAGGGCTCCAAGGGCACCATCTGCTCCCTG 

CCCTGCCCAGAGGGCTTTCACGGACCCAACTGCTCCCAGGAATGTCGCTGCCACAACGGCGGCCTC 

TGTGACCGATTCACTGGGCAGTGCCGCTGCGCTCCGGGTTACACTGGGGATCGGTGCCGGGAGGAG 

TGCCCGGTGGGCCGCTTTGGGCAGGACTGTGCTGAGACGTGCGACTGCGCCCCGGACGCCCGTTGC 

TTCCCGGCCAACGGCGCATGTCTGTGCGAACACGGCTTCACTGGGGACCGCTGCACGGATCGCCTC 

TGCCCCGACGGCTTCTACGGTCTCAGCTGCCAGGCCCCCTGCACCTGCGACCGGGAGCACAGCCTC 

AGCTGCCACCCGATGAACGGGGAGTGCTCCTGCCTGCCGGGCTGGGCGGGCCTCCACTGCAACGA 

GAGCTGCCCGCAGGACACGCATGGGCCAGGGTGCCAGGAGCACTGTCTCTGCCTGCACGGTGGCG 

TCTGCCAGGCTACCAGCGGCCTCTGTCAGTGCGCGCCGGGTTACACGGGCCCTCACTGTGCTAGTC 

TTTGTCCTCCTGACACCTACGGTGTCAACTGTTCTGCACGCTGCTCATGTGAAAATGCCATCGCCTG 

CTCACCCATCGACGGCGAGTGCGTCTGCAAGGAAGGTTGGCAGCGTGGTAACTGCTCTGTGCCCTG 

CCCACCCGGAACCTGGGGCTTCAGTTGCAATGCCAGCTGCCAGTGTGCCCATGAGGCAGTCTGCAG 

CCCCCAAACTGGAGCCTGTACCTGCACCCCTGGGTGGCATGGGGCCCACTGCCAGCTGCCCTGTCC 

GAAGGGGCAGTTTGGAGAAGGTTGTGCCAGTCGCTGTGACTGTGACCACTCTGATGGCTGTGACCC 

TGTTCATGGACGCTGTCAGTGCCAGGCTGGCTGGATGGGTGCCCGCTGCCACCTGTCCTGCCCTGA 

GGGCTTATGGGGAGTCAACTGTAGCAACACCTGCACCTGCAAGAATGGGGGCACCTGTCTCCCTGA 

GAATGGCAACTGCGTGTGTGCACCCGGATTCCGGGGCCCCTCCTGCCAGAGATCCTGTCAGCCTGG

CCGCTATGGCAAACGCTGTGTGCCCTGCAAGTGCGCTAACCACTCCTTCTGCCACCCCTCGAACGG

GACCTGCTACTGCCTGGCTGGCTGGACAGGCCCCGACTGCTCCCAGCGCTGCCCTCTGGGGACATT 

TGGTGCTAACTGCTCCCAGCCATGCCAGTGTGGTCCTGGAGAAAAGTGCCACCCAGAGACTGGGGC 

CTGTGTATGTCCCCCAGGGCACAGTGGTGCACCTTGCAGGATTGGAATCCAGGAGCCCTTTACTGT 

GATGCCGACCACTCCAGTAGCGTATAACTCGCTGGGTGCAGTGATTGGCATTGCAGTGCTGGGGTC 

CCTTGTGGTAGCCCTGGTGGCACTGTTCATTGGCTATCGGCACTGGCAAAAAGGCAAGGAGCACCA 

CCACCTGGCTGTGGCTTACAGCAGCGGGCGCCTGGACGGCTCCGAGTATGTCATGCCAGATGTCCC 

TCCCAGCTACAGTCACTACTACTCCAACCCCAGCTACCACACCCTGTCGCAGTGCTCCCCAAACCCC 

CCACCCCCTAACAAGGTTCCAGGCCCGCTCTTTGCCAGCCTGCAGAAACCTGAGCGGCCAGGTGGG 

GCCCAAGGGCATGATAACCACACCACCCTGCCTGCTGACTGGAAGCACCGCCGGGAGCCCCCTCCA 

GGGCCTCTGGACAGGGGGAGCAGCCGCCTGGACCGAAGCTACAGCTATAGCTACAGCAATGGCCC 

AGGCCCATTCTACAATAAAGGGCTCATCTCTGAAGAGGAGCTCGGGGCCAGTGTGGCTTCCCTGAG 

CAGTGAGAACCCATATGCCACCATCCGGGACCTGCCCAGCTTGCCAGGGGGCCCCCGGGAGAGCA 

GCTACATGGAGATGAAAGGCCCTCCCTCAGGATCTCCCCCCAGGCAGCCTCCTCAGTTCTGGGACA 

GCCAGAGGCGGCGGCAACCCCAGCCACAGAGAGACAGTGGCACCTACGAGCAGCCCAGCCCCCTG 

ATCCATGACCGAGACTCTGTGGGCTCCCAGCCCCCTCTGCCTCCGGGCCTACCCCCCGGCCACTATG 

ACTCACCCAAGAACAGCCACATCCCTGGACATTATGACTTGCCTCCAGTACGGCATCCCCCATCAC 

CTCCACTTCGACGCCAGGACCGTTGA 
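
As a consistency check on the open reading frame of SEQ ID NO:1, the following Python sketch translates two short stretches of the sequence with the standard genetic code. It is illustrative only: the helper name translate is hypothetical, and the two fragments are copied from the first and last lines of Table 1A. A 3063-nucleotide frame that ends in a stop codon holds (3063 - 3) / 3 = 1020 sense codons, which agrees with the 1020-residue length stated for the NOV1a polypeptide.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate `dna` in frame 1, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[i:i + 3].upper()]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 66 nucleotides of SEQ ID NO:1 (start codon ATG at positions 1-3).
FIRST_66 = "ATGTCACCGCCTCTGTGTCCCCTCCTTCTCCTGGCTGTGGGCCTGCGGCTGGCTGGAACTCTCAAC"
# Last 12 nucleotides of SEQ ID NO:1 (stop codon TGA at positions 3061-3063).
LAST_12 = "CAGGACCGTTGA"

print(translate(FIRST_66))  # N-terminal residues
print(translate(LAST_12))   # C-terminal residues before the stop codon
print((3063 - 3) // 3)      # sense codons in a 3063-nt frame ending in a stop
```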



The disclosed NOV1a nucleic acid sequence maps to chromosome 1 and has 1120 of 1951
bases (57%) identical to a gb:GENBANK-ID:AB011532|acc:AB011532.1 mRNA from Rattus
norvegicus (Rattus norvegicus mRNA for MEGF6, complete cds).

The NOV1a polypeptide (SEQ ID NO:2) is 1020 amino acid residues in length and is
presented using the one-letter amino acid code in Table 1B. The SignalP, Psort and/or
Hydropathy results predict that NOV1a has a signal peptide and is likely to be localized on the
plasma membrane with a certainty of 0.6760. In alternative embodiments, a NOV1a polypeptide
is located outside the cell with a certainty of 0.1000, in the endoplasmic reticulum (membrane)
with a certainty of 0.1000, or in the endoplasmic reticulum (lumen) with a certainty of 0.1000.
The SignalP predicts a likely cleavage site for a NOV1a peptide between amino acid positions 20
and 21, i.e., at the dash in the sequence AGT-LN.
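
The stated cleavage position can be checked against the N-terminus of SEQ ID NO:2 with a few lines of Python. This sketch does not reimplement SignalP; it only splits the sequence at the reported site. The function name split_at_cleavage_site is hypothetical, and the 30-residue fragment is copied from the start of Table 1B.

```python
from typing import Tuple


def split_at_cleavage_site(protein: str, last_signal_residue: int) -> Tuple[str, str]:
    """Split a precursor into (signal peptide, remainder).

    `last_signal_residue` is the 1-based position of the final residue of the
    signal peptide, i.e. 20 for a cleavage between positions 20 and 21.
    """
    return protein[:last_signal_residue], protein[last_signal_residue:]


# First 30 residues of SEQ ID NO:2 as listed in Table 1B.
NOV1A_N_TERMINUS = "MSPPLCPLLLLAVGLRLAGTLNPSDPNTCS"

signal, mature_start = split_at_cleavage_site(NOV1A_N_TERMINUS, 20)
print(f"{signal[-3:]}-{mature_start[:2]}")  # residues flanking the predicted site
print(len(signal))                          # length of the predicted signal peptide
```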



Table 1B. Encoded NOV1a Protein Sequence (SEQ ID NO:2)

MSPPLCPLLLLAVGLRLAGTLNPSDPNTCSFWESFTTTTKESHSKPFSLLPSEPCERPWEGPHTCPQPTV
VYRTVYRQVVKTDHRQRLQCCHGFYESRGFCVPLCAQECVHGRCVAPNQCQCVPGWRGDDCSSECA 
PGMWGPQCDKPCSCGNNSSCDPKSGVCSCPSGLQPPNCLQPCTPGYYGPACQFRCQCHGAPCDPQTG 
ACFCPAERTGPSCDVSCSQGTSGFFCPSTHSCQNGGVFQTPQGSCSCPPGWMVWRVGPVGMGCGSGE 
NSVGGAKQGSKGTICSLPCPEGFHGPNCSQECRCHNGGLCDRFTGQCRCAPGYTGDRCREECPVGRFG 
QDCAETCDCAPDARCFPANGACLCEHGFTGDRCTDRLCPDGFYGLSCQAPCTCDREHSLSCHPMNGE 
CSCLPGWAGLHCNESCPQDTHGPGCQEHCLCLHGGVCQATSGLCQCAPGYTGPHCASLCPPDTYGVN 
CSARCSCENAIACSPIDGECVCKEGWQRGNCSVPCPPGTWGFSCNASCQCAHEAVCSPQTGACTCTPG 
WHGAHCQLPCPKGQFGEGCASRCDCDHSDGCDPVHGRCQCQAGWMGARCHLSCPEGLWGVNCSNT 
CTCKNGGTCLPENGNCVCAPGFRGPSCQRSCQPGRYGKRCVPCKCANHSFCHPSNGTCYCLAGWTGP 
DCSQRCPLGTFGANCSQPCQCGPGEKCHPETGACVCPPGHSGAPCRIGIQEPFTVMPTTPVAYNSLGAV 
IGIAVLGSLVVALVALFIGYRHWQKGKEHHHLAVAYSSGRLDGSEYVMPDVPPSYSHYYSNPSYHTLS 
QCSPNPPPPNKVPGPLFASLQKPERPGGAQGHDNHTTLPADWKHRREPPPGPLDRGSSRLDRSYSYSYS
NGPGPFYNKGLISEEELGASVASLSSENPYATIRDLPSLPGGPRESSYMEMKGPPSGSPPRQPPQFWDSQ
RRRQPQPQRDSGTYEQPSPLIHDRDSVGSQPPLPPGLPPGHYDSPKNSHIPGHYDLPPVRHPPSPPLRRQ 
DR 



The NOV1a amino acid sequence has 834 of 1064 amino acid residues (78%) identical to,
and 881 of 1064 amino acid residues (82%) similar to, the 1034 amino acid residue
gi|17386053|gb|AAL38571.1|AF444274_1 (AF444274) Jedi protein [Mus musculus] (E = 0.0).
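
The percentages quoted for NOV1a follow directly from the match counts, as the Python sketch below shows. The function names are hypothetical, and the use of truncation rather than rounding is an inference from the numbers themselves: 881 of 1064 is about 82.8%, which is reported as 82%.

```python
def percent_identity(matches: int, aligned_length: int) -> int:
    """Whole-number percent identity, truncated (not rounded)."""
    if aligned_length <= 0:
        raise ValueError("aligned length must be positive")
    return matches * 100 // aligned_length


def count_identities(aligned_a: str, aligned_b: str) -> int:
    """Count identical, non-gap positions in two equal-length aligned strings."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    return sum(a == b and a != "-" for a, b in zip(aligned_a, aligned_b))


print(percent_identity(1120, 1951))  # NOV1a nucleotide vs. rat MEGF6 mRNA
print(percent_identity(834, 1064))   # NOV1a protein vs. mouse Jedi, identical residues
print(percent_identity(881, 1064))   # NOV1a protein vs. mouse Jedi, similar residues
```

The helper count_identities shows how a match count could be obtained from a gapped pairwise alignment; the alignments themselves are not reproduced in this section, so it is exercised only on toy input.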

Possible single nucleotide polymorphisms (SNPs) found for NOV1a are listed in Table 1C.



Table 1C: SNPs for NOV1a

Variant     Nucleotide Position  Base Change  Amino Acid Position  Amino Acid Change
13374399    447                  C>T          NA                   NA
13374400    934                  C>A          NA                   NA
13374401    975                  G>A          NA                   NA
13374402    984                  C>T          NA                   NA
13374403    1011                 T>C          NA                   NA
13374404    1269                 C>A          NA                   NA
13374405    1278                 T>C          NA                   NA
13374406    1297                 C>T          433                  His > Tyr
13374407    1298                 A>G          433                  His > Arg
13374408    1398                 T>A          NA                   NA
13374409    1585                 A>G          529                  Ser > Gly
13374410    1595                 C>T          532                  Thr > Ile
13374411    1701                 C>T          NA                   NA
13374413    2300                 G>A          767                  Gly > Asp
13374414    2361                 T>C          NA                   NA



NOV1a is expressed in at least the following tissues: testis. This information was derived 
by determining the tissue sources of the sequences that were included in the invention, including 
but not limited to SeqCalling sources, Public EST sources, Literature sources, and/or RACE 
sources. 

NOV1b 

A disclosed NOV1b nucleic acid (designated CuraGen Acc. No. CG57012-01) includes the 
2919 nucleotide sequence (SEQ ID NO:3) shown in Table 1D. An open reading frame for the 
mature protein was identified beginning with an ATG codon at nucleotides 83-85 and ending with 
a TGA codon at nucleotides 2867-2869. The start and stop codons of the open reading frame are 
highlighted in bold type. Putative untranslated regions are underlined. 

Table 1D. NOV1b Nucleotide Sequence (SEQ ID NO:3) 

AGATCTCTGCAGACAGGTCCTCCAGGCTGCTGGCTGCAGCGCCACTGCCCACTCTGCGCCGGTCTTGCTGCAG 
GCCTCTGCAATGTCACCGCCTCTGTGTCCCCTCCTTCTCCTGGCTGTGGGCCTGCGGCTGGCTGGAACTCTCA 
ACCCCAGTGATCCCAATACCTGCAGCTTCTGGGAAAGCTTCACTACCACCACCAAGGAGTCCCACTCCCGCCC 
CTTCAGCCTGCTCCCCTCAGAGCCCTGCGAGCGGCCCTGGGAGGGCCCCCATACTTGCCCCCAGCCCACGGTT 
GTATACCGGACCGTGTACCGTCAGGTGGTGAAGACGGACCACCGCCAGCGCCTGCAGTGCTGCCATGGCTTCT 
ATGAGAGCAGGGGGTTCTGTGTCCCGCTCTGTGCCCAGGAGTGTGTCCATGGCCGTTGTGTGGCACCCAATCA 
GTGCCAATGTGTGCCAGGCTGGCGGGGCGACGACTGTTCCAGTGAGTGTGCCCCAGGAATGTGGGGGCCACAG 
TGTGACAAGCCCTGCAGCTGCGGCAACAACAGCTCGTGTGATCCCAAGAGTGGGGTATGTTCTTGCCCTTCTG 
GTCTGCAGCCCCCGAACTGCCTTCAGCCCTGTACCCCTGGCTACTATGGCCCTGCCTGCCAGTTCCGCTGCCA 
GTGCCATGGGGCACCCTGCGATCCCCAGACTGGAGCCTGCTTCTGCCCCGCAGAGAGAACTGGGCCCAGCTGT 
GACGTGTCCTGTTCCCAGGGCACTTCTGGCTTCTTCTGCCCCAGCACCCATTCTTGCCAAAATGGAGGTGTCT 
TCCAAACCCCACAGGGCTCCTGCAGCTGCCCCCCTGGCTGGATGGTATGGAGGGTGGGGCCTGTGGGCATGGG 
GTGTGGGTCTGGGGAGAATTCTGTGGGTGGTGCTAAGCAGGGCTCCAAGGGCACCATCTGCTCCCTGCCCTGC 
CCAGAGGGCTTTCACGGACCCAACTGCTCCCAGGAATGTCGCTGCCACAACGGCGGCCTCTGTGACCGATTCA 
CTGGGCAGTGCCGCTGCGCTCCGGGTTACACTGGGGATCGGTGCCGGGAGGAGTGCCCGGTGGGCCGCTTTGG 
GCAGGACTGTGCTGAGACGTGCGACTGCGCCCCGGACGCCCGTTGCTTCCCGGCCAACGGCGCATGTCTGTGC 
GAACACGGCTTCACTGGGGACCGCTGCACGGATCGCCTCTGCCCCGACGGCTTCTACGGTCTCAGCTGCCAGG 
CCCCCCGCACCTGCGACCGGGAGCACAGCCTCAGCTGCCACCCGATGAACGGGGAGTGCTCCTGCCTGCCGGG 
CTGGGCGGGCCTCCACTGCAACGAGAGCTGCCCGCAGGACACGCATGGGCCAGGGTGCCAGGAGCACTGTCTC 
TGCCTGCACGGTGGCGTCTGCCAGGCTACCAGCGGCCTCTGTCAGTGCGCGCCGGGTTACACGGGCCCTCACT 
GTGCTAGTCTTTGTCCTCCTGACACCTACGGTGTCAACTGTTCTGCACGCTGCTCATGTGAAAATGCCATCGC 
CTGCTCACCCATCGACGGCGAGTGCGTCTGCAAGGAAGGTTGGCAGCGTGGTAACTGCTCTGTGCCCTGCCCA 
CCCGGAACCTGGGGCTTCAGTTGCAATGCCAGCTGCCAGTGTGCCCATGAGGCAGTCTGCAGCCCCCAAACTG 
GAGCCTGTACCTGCACCCCTGGGTGGCATGGGGCCCACTGCCAGCTGCCCTGTCCGAAGGGGCAGTTTGGAGA 
AGGTTGTGCCAGTCGCTGTGACTGTGACCACTCTGATGGCTGTGACCCTGTTCATGGACGCTGTCAGTGCCAG 
GCTGGCTGGATGGGTGCCCGCTGCCACCTGTCCTGCCCTGAGGGCTTATGGGGAGTCAACTGTAGCAACACCT 
GCACCTGCAAGAATGGGGGCACCTGTCTCCCTGAGAATGGCAACTGCGTGTGTGCGCCCGGATTCCGGGGCCC 
CTCCTGCCAGAGATCCTGTCAGCCTGGCCGCTATGGCAAACGCTGTGTGCCCTGCAAGTGCGCTAACCACTCC 
TTCTGCCACCCCTCGAACGGGGCCTGCTACTGCCTGGCTGGCTGGACAGGCCCCGACTGCTCCCAGCCATGCC 
CTCCAGGACACTGGGGAGAAAACTGTGCCCAGACCTGCCAATGTCACCATGGTGGGACCTGCCATCCCCAGGA 
TGGGAGCTGTATCTGCCCCCTAGGCTGGACTGGACACCACTGCTTAGAAGGCTGCCCTCTGGGGACATTTGGT 
GCTAACTGCTCCCAGCCATGCCAGTGTGGTCCTGGAGAAAAGTGCCACCCAGAGACTGGGGCCTGTGTATGTC 
CCCCAGGGCACAGTGGTGCACCTTGCAGGATTGGAATCCAGGAGCCCTTTACTGTGATGCCGACCACTCCAGT 
AGCGTATAACTCGCTGGGTGCAGTGATTGGCATTGCAGTGCTGGGGTCCCTTGTGGTAGCCCTGGTGGCACTG 
TTCATTGGCTATCGGCACTGGCAAAAAGACAAGGAGCACCACCACCTGGCTGTGGCTTACAGCAGCGGGCGCC 
TGGACGGCTCCGAGTATGTCATGCCAGATGTCCCTCCGAGCTACAGTCACTACTACTCCAACCCCAGCTACCA 
CACCCTGTCGCAGTGCTCCCCAAACCCCCCACCCCCTAACAAGGTTCCAGGCCCGCTCTTTGCCAGCCTGCAG 
AACCCTGAGCGGCCAGGTGGGGCCCAAGGGCATGATAACCACACCACCCTGCCTGCTGACTGGAAGCACCGCC 
GGGAGCCCCCTCCAGGGCCTCTGGACAGGGGTAGGTGCCGGGAGGCCAGGGTCTCTGGCGCGGGTGGATGTGT 
GCAGCCCAGATGCCGCGTCTGAGTGTGTGTGTCTGGAGACGGGGGCTCTGGGCCCCATTTCTAGAGGAAGTG 



The disclosed NOV1b nucleic acid sequence maps to chromosome 1 and has 853 of 1409 
bases (60%) identical to a gb:GENBANK-ID:AB011532|acc:AB011532.1 mRNA from Rattus 
norvegicus (Rattus norvegicus mRNA for MEGF6, complete cds). 

The NOV1b polypeptide (SEQ ID NO:4) is 928 amino acid residues in length and is 
presented using the one-letter amino acid code in Table 1E. The SignalP, Psort and/or 
Hydropathy results predict that NOV1b has a signal peptide and is likely to be localized to the 
plasma membrane with a certainty of 0.6760. In alternative embodiments, a NOV1b polypeptide 
is located to the outside of the cell with a certainty of 0.1000, the endoplasmic reticulum 
(membrane) with a certainty of 0.1000, or the endoplasmic reticulum (lumen) with a certainty of 
0.1000. The SignalP predicts a likely cleavage site for a NOV1b peptide between amino acid 
positions 20 and 21, i.e., at the dash in the sequence AGT-LN. 



Table 1E. Encoded NOV1b Protein Sequence (SEQ ID NO:4) 

MSPPLCPLLLLAVGLRLAGTLNPSDPNTCSFWESFTTTTKESHSRPFSLLPSEPCERPWEGPHTCPQPTWYRT 
VYRQWKTDHRQRLQCCHGFYESRGFCVPLCAQECVHGRCVAPNQCQCVPGWRGDDCSSECAPGMWGPQCDKPC 
SCGNNSSCDPKSGVCSCPSGLQPPNCLQPCTPGYYGPACQFRCQCHGAPCDPQTGACFCPAERTGPSCDVSCSQ 
GTSGFFCPSTHSCQNGGVFQTPQGSCSCPPGWMVWRVGPVGMGCGSGENSVGGAKQGSKGTICSLPCPEGFHGP 
NCSQECRCHNGGLCDRFTGQCRCAPGYTGDRCREECPVGRFGQDCAETCDCAPDARCFPANGACLCEHGFTGDR 
CTDRLCPDGFYGLSCQAPRTCDREHSLSCHPMNGECSCLPGWAGLHCNESCPQDTHGPGCQEHCLCLHGGVCQA 
TSGLCQCAPGYTGPHCASLCPPDTYGVNCSARCSCENAIACSPIDGECVCKEGWQRGNCSVPCPPGTWGFSCWA 
SCQCAHEAVCSPQTGACTCTPGWHGAHCQLPCPKGQFGEGCASRCDCDHSDGCDPVHGRCQCQAGWMGARCHLS 
CPEGLWGVNCSNTCTCKNGGTCLPENGNCVCAPGFRGPSCQRSCQPGRYGKRCVPCKCANHSFCHPSNGACYCL 
AGWTGPDCSQPCPPGHWGENCAQTCQCHHGGTCHPQDGSCICPLGWTGHHCLEGCPLGTFGANCSQPCQCGPGE 
KCHPETGACVCPPGHSGAPCRIGIQEPFTVMPTTPVAYNSLGAVIGIAVLGSLWALVALFIGYRHWQKDKEHH 
HLAVAYSSGRLDGSEYVMPDVPPSYSHYYSNPSYHTLSQCSPNPPPPNKVPGPLFASLQNPERPGGAQGHDNHT 
TLPADWKHRREPPPGPLDRGRCREARVSGAGGCVQPRCRV 
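
The predicted cleavage between amino acid positions 20 and 21 can be checked against the first residues of the NOV1b protein sequence listed above. The following is a minimal sketch only; the constant and function names are illustrative and are not part of the disclosure.

```python
from typing import Tuple

# First 30 residues of the NOV1b protein sequence as listed above.
NOV1B_N_TERMINUS = "MSPPLCPLLLLAVGLRLAGTLNPSDPNTCS"


def split_at_cleavage(sequence: str, last_signal_residue: int) -> Tuple[str, str]:
    """Split a protein sequence after the given 1-based residue position."""
    if not 0 < last_signal_residue < len(sequence):
        raise ValueError("cleavage position must fall inside the sequence")
    return sequence[:last_signal_residue], sequence[last_signal_residue:]


signal_peptide, mature_start = split_at_cleavage(NOV1B_N_TERMINUS, 20)
# Show the three residues before and the two residues after the cleavage site.
print(f"{signal_peptide[-3:]}-{mature_start[:2]}")
```

The printed motif is AGT-LN, matching the site named in the text.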



The NOV1b amino acid sequence has 834 of 1064 amino acid residues (78%) identical to, and 
881 of 1064 amino acid residues (82%) similar to, the 1034 amino acid residue 
gi|17386053|gb|AAL38571.1|AF444274_1 (AF444274) Jedi protein [Mus musculus] (E = 0.0). 

Possible small nucleotide polymorphisms (SNPs) found for NOV1b are listed in Table 1F. 



Table 1F: SNPs for NOV1b 

Variant     Nucleotide   Base     Amino Acid   Amino Acid 
            Position     Change   Position     Change 
13374399    529          C>T      NA           NA 
13374400    1016         C>A      NA           NA 
13374401    1057         G>A      NA           NA 
13374402    1066         C>T      NA           NA 
13374403    1003         T>C      NA           NA 
13374408    1480         T>A      NA           NA 
13374409    1667         A>G      529          Ser > Gly 
13374410    1677         C>T      532          Thr > Ile 
13374411    1783         C>T      NA           NA 
13374413    2511         A>G      810          Asp > Gly 
13374414    2572         T>C      NA           NA 
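
The amino acid positions in the NOV1b SNP table follow from the nucleotide positions once the start of the open reading frame (nucleotide 83 for NOV1b, as stated above) is taken into account. A minimal sketch of that mapping is given below; the function name is illustrative, and 1-based numbering of both nucleotides and residues is assumed.

```python
from typing import Tuple


def codon_position(nt_pos: int, orf_start: int) -> Tuple[int, int]:
    """Map a 1-based nucleotide position to (amino acid position, base within codon).

    orf_start is the 1-based position of the first base of the start codon.
    """
    offset = nt_pos - orf_start
    if offset < 0:
        raise ValueError("position lies upstream of the open reading frame")
    return offset // 3 + 1, offset % 3 + 1


NOV1B_ORF_START = 83  # ATG codon at nucleotides 83-85

# Coding SNPs listed for NOV1b: nucleotide position -> reported amino acid position.
for nt_pos, reported_aa in [(1667, 529), (1677, 532), (2511, 810)]:
    aa_pos, base_in_codon = codon_position(nt_pos, NOV1B_ORF_START)
    print(nt_pos, aa_pos, base_in_codon, aa_pos == reported_aa)
```

Under these assumptions, nucleotide 1667 is the first base of codon 529, and nucleotides 1677 and 2511 are the second bases of codons 532 and 810, in agreement with the table.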

NOV1c 

A disclosed NOV1c nucleic acid (designated CuraGen Acc. No. CG57012-02) includes the 
2919 nucleotide sequence (SEQ ID NO:5) shown in Table 1G. An open reading frame for the 
mature protein was identified beginning with an ATG codon at nucleotides 83-85 and ending with 
a TGA codon at nucleotides 2867-2869. The start and stop codons of the open reading frame are 
highlighted in bold type. Putative untranslated regions are underlined. 



Table 1G. NOV1c Nucleotide Sequence (SEQ ID NO:5) 

AGATCTCTGCAGACAGGTCCTCCAGGCTGCTGGCTGCAGCGCCACTGCCCACTCTGCGCCGGTCTTGCTGCAG 
GCCTCTGCAATGTCACCGCCTCTGTGTCCCCTCCTTCTCCTGGCTGTGGGCCTGCGGCTGGCTGGAACTCTCA 
ACCCCAGTGATCCCAATACCTGCAGCTTCTGGGAAAGCTTCACTACCACCACCAAGGAGTCCCACTCCCGCCC 
CTTCAGCCTGCTCCCCTCAGAGCCCTGCGAGCGGCCCTGGGAGGGCCCCCATACTTGCCCCCAGCCCACGGTT 
GTATACCGGACCGTGTACCGTCAGGTGGTGAAGACGGACCACCGCCAGCGCCTGCAGTGCTGCCATGGCTTCT 
ATGAGAGCAGGGGGTTCTGTGTCCCGCTCTGTGCCCAGGAGTGTGTCCATGGCCGTTGTGTGGCACCCAATCA 
GTGCCAATGTGTGCCAGGCTGGCGGGGCGACGACTGTTCCAGTGAGTGTGCCCCAGGAATGTGGGGGCCACAG 
TGTGACAAGCCCTGCAGCTGCGGCAACAACAGCTCGTGTGATCCCAAGAGTGGGGTATGTTCTTGCCCTTCTG 
GTCTGCAGCCCCCGAACTGCCTTCAGCCCTGTACCCCTGGCTACTATGGCCCTGCCTGCCAGTTCCGCTGCCA 
GTGCCATGGGGCACCCTGCGATCCCCAGACTGGAGCCTGCTTCTGCCCCGCAGAGAGAACTGGGCCCAGCTGT 
GACGTGTCCTGTTCCCAGGGCACTTCTGGCTTCTTCTGCCCCAGCACCCATTCTTGCCAAAATGGAGGTGTCT 
TCCAAACCCCACAGGGCTCCTGCAGCTGCCCCCCTGGCTGGATGGTATGGAGGGTGGGGCCTGTGGGCATGGG 
GTGTGGGTCTGGGGAGAATTCTGTGGGTGGTGCTAAGCAGGGCTCCAAGGGCACCATCTGCTCCCTGCCCTGC 
CCAGAGGGCTTTCACGGACCCAACTGCTCCCAGGAATGTCGCTGCCACAACGGCGGCCTCTGTGACCGATTCA 
CTGGGCAGTGCCGCTGCGCTCCGGGTTACACTGGGGATCGGTGCCGGGAGGAGTGCCCGGTGGGCCGCTTTGG 
GCAGGACTGTGCTGAGACGTGCGACTGCGCCCCGGACGCCCGTTGCTTCCCGGCCAACGGCGCATGTCTGTGC 
GAACACGGCTTCACTGGGGACCGCTGCACGGATCGCCTCTGCCCCGACGGCTTCTACGGTCTCAGCTGCCAGG 
CCCCCCGCACCTGCGACCGGGAGCACAGCCTCAGCTGCCACCCGATGAACGGGGAGTGCTCCTGCCTGCCGGG 
CTGGGCGGGCCTCCACTGCAACGAGAGCTGCCCGCAGGACACGCATGGGCCAGGGTGCCAGGAGCGCTGTCTC 
TGCCTGCACGGTGGCGTCTGCCAGGCTACCAGCGGCCTCTGTCAGTGCGCGCCGGGTTACACGGGCCCTCACT 
GTGCTAGTCTTTGTCCTCCTGACACCTACGGTGTCAACTGTTCTGCACGCTGCTCATGTGAAAATGCCATCGC 
CTGCTCACCCATCGACGGCGAGTGCGTCTGCAAGGAAGGTTGGCAGCGTGGTAACTGCTCTGTGCCCTGCCCA 
CCCGGAACCTGGGGCTTCAGTTGCAATGCCAGCTGCCAGTGTGCCCATGAGGCAGTCTGCAGCCCCCAAACTG 
GAGCCTGTACCTGCACCCCTGGGTGGCATGGGGCCCACTGCCAGCTGCCCTGTCCGAAGGGGCAGTTTGGAGA 
AGGTTGTGCCAGTCGCTGTGACTGTGACCACTCTGATGGCTGTGACCCTGTTCATGGACGCTGTCAGTGCCAG 
GCTGGCTGGATGGGTGCCCGCTGCCACCTGTCCTGCCCTGAGGGCTTATGGGGAGTCAACTGTAGCAACACCT 
GCACCTGCAAGAATGGGGGCACCTGTCTCCCTGAGAATGGCAACTGCGTGTGTGCGCCCGGATTCCGGGGCCC 
CTCCTGCCAGAGATCCTGTCAGCCTGGCCGCTATGGCAAACGCTGTGTGCCCTGCAAGTGCGCTAACCACTCC 
TTCTGCCACCCCTCGAACGGGGCCTGCTACTGCCTGGCTGGCTGGACAGGCCCCGACTGCTCCCAGCCATGCC 
CTCCAGGACACTGGGGAGAAAACTGTGCCCAGACCTGCCAATGTCACCATGGTGGGACCTGCCATCCCCAGGA 
TGGGAGCTGTATCTGCCCCCTAGGCTGGACTGGACACCACTGCTTAGAAGGCTGCCCTCTGGGGACATTTGGT 
GCTAACTGCTCCCAGCCATGCCAGTGTGGTCCTGGAGAAAAGTGCCACCCAGAGACTGGGGCCTGTGTATGTC 
CCCCAGGGCACAGTGGTGCACCTTGCAGGATTGGAATCCAGGAGCCCTTTACTGTGATGCCGACCACTCCAGT 
AGCGTATAACTCGCTGGGTGCAGTGATTGGCATTGCAGTGCTGGGGTCCCTTGTGGTAGCCCTGGTGGCACTG 
TTCATTGGCTATCGGCACTGGCAAAAAGACAAGGAGCACCACCACCTGGCTGTGGCTTACAGCAGCGGGCGCC 
TGGACGGCTCCGAGTATGTCATGCCAGATGTCCCTCCGAGCTACAGTCACTACTACTCCAACCCCAGCTACCA 
CACCCTGTCGCAGTGCTCCCCAAACCCCCCACCCCCTAACAAGGTTCCAGGCCCGCTCTTTGCCAGCCTGCAG 
AACCCTGAGCGGCCAGGTGGGGCCCAAGGGCATGATAACCACACCACCCTGCCTGCTGACTGGAAGCACCGCC 
GGGAGCCCCCTCCAGGGCCTCTGGACAGGGGTAGGTGCCGGGAGGCCAGGGTCTCTGGCGCGGGTGGATGTGT 
GCAGCCCAGATGCCGCGTCTGAGTGTGTGTGTCTGGAGACGGGGGCTCTGGGCCCCATTTCTAGAGGAAGTG 

The nucleic acid sequence of NOV1c maps to chromosome 1 and has 852 of 1409 bases 
(60%) identical to a gb:GENBANK-ID:AB011532|acc:AB011532.1 mRNA from Rattus 
norvegicus (Rattus norvegicus mRNA for MEGF6, complete cds). 

The NOV1c polypeptide (SEQ ID NO:6) is 928 amino acid residues in length and is 
presented using the one-letter amino acid code in Table 1H. The SignalP, Psort and/or 
Hydropathy results predict that NOV1c has a signal peptide and is likely to be localized to the 
plasma membrane with a certainty of 0.6760. In alternative embodiments, a NOV1c polypeptide 
is located to the outside of the cell with a certainty of 0.1000, the endoplasmic reticulum 
(membrane) with a certainty of 0.1000, or the endoplasmic reticulum (lumen) with a certainty of 
0.1000. The SignalP predicts a likely cleavage site for a NOV1c peptide between amino acid 
positions 20 and 21, i.e., at the dash in the sequence AGT-LN. 



Table 1H. Encoded NOV1c Protein Sequence (SEQ ID NO:6) 

MSPPLCPLLLLAVGLRLAGTLNPSDPNTCSFWESFTTTTKESHSRPFSLLPSEPCERPWEGPHTCPQPTWYRT 
VYRQWKTDHRQRLQCCHGFYESRGFCVPLCAQECVHGRCVAPNQCQCVPGWRGDDCSSECAPGMWGPQCDKPC 
SCGNNSSCDPKSGVCSCPSGLQPPNCLQPCTPGYYGPACQFRCQCHGAPCDPQTGACFCPAERTGPSCDVSCSQ 
GTSGFFCPSTHSCQNGGVFQTPQGSCSCPPGWMVWRVGPVGMGCGSGENSVGGAKQGSKGTICSLPCPEGFHGP 
NCSQECRCHNGGLCDRFTGQCRCAPGYTGDRCREECPVGRFGQDCAETCDCAPDARCFPANGACLCEHGFTGDR 
CTDRLCPDGFYGLSCQAPRTCDREHSLSCHPMNGECSCLPGWAGLHCNESCPQDTHGPGCQERCLCLHGGVCQA 
TSGLCQCAPGYTGPHCASLCPPDTYGVNCSARCSCENAIACSPIDGECVCKEGWQRGNCSVPCPPGTWGFSCNA 
SCQCAHEAVCSPQTGACTCTPGWHGAHCQLPCPKGQFGEGCASRCDCDHSDGCDPVHGRCQCQAGWMGARCHLS 
CPEGLWGVNCSNTCTCKNGGTCLPENGMCVCAPGFRGPSCQRSCQPGRYGKRCVPCKCANHSFCHPSNGACYCL 
AGWTGPDCSQPCPPGHWGENCAQTCQCHHGGTCHPQDGSCICPLGWTGHHCLEGCPLGTFGANCSQPCQCGPGE 
KCHPETGACVCPPGHSGAPCRIGIQEPFTVMPTTPVAYNSLGAVIGIAVLGSLWALVALFIGYRHWQKDKEHH 
HLAVAYSSGRLDGSEYVMPDVPPSYSHYYSNPSYHTLSQCSPNPPPPNKVPGPLFASLQNPERPGGAQGHDNHT 
TLPADWKHRREPPPGPLDRGRCREARVSGAGGCVQPRCRV 



The NOV1c amino acid sequence has 834 of 1064 amino acid residues (78%) identical to, 
and 881 of 1064 amino acid residues (82%) similar to, the 1034 amino acid residue 
gi|17386053|gb|AAL38571.1|AF444274_1 (AF444274) Jedi protein [Mus musculus] (E = 0.0). 



NOV1d 

A disclosed NOV1d nucleic acid (designated CuraGen Acc. No. CG57012-03) includes the 
5000 nucleotide sequence (SEQ ID NO:7) shown in Table 1I. An open reading frame for the 
mature protein was identified beginning with an ATG codon at nucleotides 83-85 and ending with 
a TGA codon at nucleotides 3194-3196. The start and stop codons of the open reading frame are 
highlighted in bold type. 

Table 1I. NOV1d Nucleotide Sequence (SEQ ID NO:7) 



AGATCTCTGCAGACAGGTCCTCCAGGCTGCTGGCTGCAGCGCCACTGCCCACTCTGCGCCGGTCT 

TGCTGCAGGCCTCTGCAATGTCACCGCCTCTGTGTCCCCTCCTTCTCCTGGCTGTGGGCCTGCGG 

CTGGCTGGAACTCTCAACCCCAGTGATCCCAATACCTGCAGCTTCTGGGAAAGCTTCACTACCAC 

CACCAAGGAGTCCCACTCCCGCCCCTTCAGCCTGCTCCCCTCAGAGCCCTGCGAGCGGCCCTGGG 

AGGGCCCCCATACTTGCCCCCAGCCCACGGTTGTATACCGGACCGTGTACCGTCAGGTGGTGAA 

GACGGACCACCGCCAGCGCCTGCAGTGCTGCCATGGCTTCTATGAGAGCAGGGGGTTCTGTGTC 

CCGCTCTGTGCCCAGGAGTGTGTCCATGGCCGTTGTGTGGCACCCAATCAGTGCCAATGTGTGCC 

AGGCTGGCGGGGCGACGACTGTTCCAGTGAGTGTGCCCCAGGAATGTGGGGGCCACAGTGTGAC 

AAGCCCTGCAGCTGCGGCAACAACAGCTCGTGTGATCCCAAGAGTGGGGTATGTTCTTGCCCTTC 

TGGTCTGCAGCCCCCGAACTGCCTTCAGCCCTGTACCCCTGGCTACTATGGCCCTGCCTGCCAGT 

TCCGCTGCCAGTGCCATGGGGCACCCTGCGATCCCCAGACTGGAGCCTGCTTCTGCCCCGCAGA 

GAGAACTGGGCCCAGCTGTGACGTGTCCTGTTCCCAGGGCACTTCTGGCTTCTTCTGCCCCAGCA 

CCCATCCTTGCCAAAATGGAGGTGTCTTCCAAACCCCACAGGGCTCCTGCAGCTGCCCCCCTGGC 

TGGATGGGCACCATCTGCTCCCTGCCCTGCCCAGAGGGCTTTCACGGACCCAACTGCTCCCAGGA 

ATGTCGCTGCCACAACGGCGGCCTCTGTGACCGATTCACTGGGCAGTGCCGCTGCGCTCCGGGTT 

ACACTGGGGATCGGTGCCGGGAGGAGTGCCCGGTGGGCCGCTTTGGGCAGGACTGTGCTGAGAC 

GTGCGACTGCGCCCCGGACGCCCGTTGCTTCCCGGCCAACGGCGCATGTCTGTGCGAACACGGC 

TTCACTGGGGACCGCTGCACGGATCGCCTCTGCCCCGACGGCTTCTACGGTCTCAGCTGCCAGGC 

CCCCCGCACCTGCGACCGGGAGCACAGCCTCAGCTGCCACCCGATGAACGGGGAGTGCTCCTGC 

CTGCCGGGCTGGGCGGGCCTCCACTGCAACGAGAGCTGCCCGCAGGACACGCATGGGCCAGGGT 

GCCAGGAGCACTGTCTCTGCCTGCACGGTGGCGTCTGCCAGGCTACCAGCGGCCTCTGTCAGTGC 

GCGCCGGGTTACACGGGCCCTCACTGTGCTAGTCTTTGTCCTCCTGACACCTACGGTGTCAACTG 

TTCTGCACGCTGCTCATGTGAAAATGCCATCGCCTGCTCACCCATCGACGGCGAGTGCGTCTGCA 

AGGAAGGTTGGCAGCGTGGTAACTGCTCTGTGCCCTGCCCACCCGGAACCTGGGGCTTCAGTTG 

CAATGCCAGCTGCCAGTGTGCCCATGAGGCAGTCTGCAGCCCCCAAACTGGAGCCTGTACCTGC 

ACCCCTGGGTGGCATGGGGCCCACTGCCAGCTGCCCTGTCCGAAGGGGCAGTTTGGAGAAGGTT 

GTGCCAGTCGCTGTGACTGTGACCACTCTGATGGCTGTGACCCTGTTCATGGACGCTGTCAGTGC 

CAGGCTGGCTGGATGGGTGCCCGCTGCCACCTGTCCTGCCCTGAGGGCTTATGGGGAGTCAACT 

GTAGCAACACCTGCACCTGCAAGAATGGGGGCACCTGTCTCCCTGAGAATGGCAACTGCGTGTG 

TGCGCCCGGATTCCGGGGCCCCTCCTGCCAGAGATCCTGTCAGCCTGGCCGCTATGGCAAACGCT 

GTGTGCCCTGCAAGTGCGCTAACCACTCCTTCTGCCACCCCTCGAACGGGACCTGCTACTGCCTG 

GCTGGCTGGACAGGCCCCGACTGCTCCCAGCCATGCCCTCCAGGACACTGGGGAGAAAACTGTG 

CCCAGACCTGCCAATGTCACCATGGTGGGACCTGCCATCCCCAGGATGGGAGCTGTATCTGCCC 

CCTAGGCTGGACTGGACACCACTGCTTAGAAGGCTGCCCTCTGGGGACATTTGGTGCTAACTGCT 

CCCAGCCATGCCAGTGTGGTCCTGGAGAAAAGTGCCACCCAGAGACTGGGGCCTGTGTATGTCC 

CCCAGGGCACAGTGGTGCACCTTGCAGGATTGGAATCCAGGAGCCCTTTACTGTGATGCCGACC 

ACTCCAGTAGCGTATAACTCGCTGGGTGCAGTGATTGGCATTGCAGTGCTGGGGTCCCTTGTGGT 

AGCCCTGGTGGCACTGTTCATTGGCTATCGGCACTGGCAAAAAGACAAGGAGCACCACCACCTG 

GCTGTGGCTTACAGCAGCGGGCGCCTGGACGGCTCCGAGTATGTCATGCCAGATGTCCCTCCGA 

GCTACAGTCACTACTACTCCAACCCCAGCTACCACACCCTGTCGCAGTGCTCCCCAAACCCCCCA 

CCCCCTAACAAGGTTCCAGGCCCGCTCTTTGCCAGCCTGCAGAACCCTGAGCGGCCAGGTGGGG 

CCCAAGGGCATGATAACCACACCACCCTGCCTGCTGACTGGAAGCACCGCCGGGAGCCCCCTCC 

AGGGCCTCTGGACAGGGGGAGCAGCCGCCTGGACCGAAGCTACAGCTATAGCTACAGCAATGG 

CCCAGGCCCATTCTACAATAAAGGGCTCATCTCTGAAGAGGAGCTCTGGGCCAGTGTGGCTTCC 

CTGAGCAGTGAGAACCCATATGCCACCATCCGGGACCTGCCCAGCTTGCCAGGGGGCCCCCGGG 

AGAGCAGCTACATGGAGATGAAAGGCCCTCCCTCAGGATCTCCCCCCAGGCAGCCTCCTCAGTT 

CTGGGACAGCCAGAGGCGGCGGCAACCCCAGCCACAGAGAGACAGTGGCACCTACGAGCAGCC 

CAGCCCCCTGATCCATGACCGAGACTCTGTGGGCTCCCAGCCCCCTCTGCCTCCGGGCCTACCCC 

CCGGCCACTATGACTCACCCAAGAACAGCCACATCCCTGGACATTATGACTTGCCTCCAGTACG 

GCATCCCCCATCACCTCCACTTCGACGCCAGGACCGTTGAGGAGCCAGGATGGTATGGCAGAGG 

CCAGCACACCTGGCTGTTGCTGCTCAAGGCTGGGGACAGAGCCTAGTGTACCCCTGCCAGGAGC 

AGGGAGTGGACCGGCAGGCTGTGAACATGAACAACGCTTAACAGAGCAAGTGATGGGAGCCTT 

GTTCCTGGGTTCTACCATGGGAGACGCTGATCAGCAGGATGCCTGGCTCCCTTTCCCAACCCACT 

GCTCCCAAGGCCTCCAGGGCCCTGTGTACATAAACTGGTGGGTTGGAAGTTGCTGGGTAACTCT 

GATTTCAGACATGCGTGTGGGGTACCTTTTCTGTGCATGCTCAGCCTGGGCTCTGTGCGTGTGTG 

THTTTrTHTr, A TTTTAG A A GGGT ACC AG GC ACAGGTTCTGTCCTAGGGC ACTTACC ATTTAGTAG 

GGAGATGGAACCAACCCAATTAACTCTAGCAATAGCCTCCTAACTGGCCTCCTCCATTGATTCAG 

TGAACCTTCCAATGCATGGCTCATAATTTCAAAATACAGGCTGGTTAGTTACTCCCTACCTGAAA 

GCCTTCATAGGTGCCTCTTTGCTCTTCTGCCAGTATCAAAACTTTTGAAGGCCTTAAAGGCCCTG 

CTTTGCCTGGCCCATCTGTCTCTCCAGCCTCACCTTGAACTGTGTTCCTGTCACTGCACGCCAGTC 

ACACCGGCCTCTAGGTCCTCCTGTAGGCCACTCTTCTTTCTGGCACAGGGACCTGCACACCTGGA 

GTGCCCTTCCTCCCCCACTCGCCTGTTCACCCCTGCTTTTCCTTTACACCTCCTCCTCAGGGAAGT 

GCCCACCCTCCGTACATCTTTCACAGCCCTGATTGCAGCTGTGTTCACTCACCAGGTACCTGCAG 

AAGGCCTACAGGGTGCCAGGCACTTCTTTAATGGGTTCTTTCTTTATGTGATTATTTGATTAATCT 

CTGCCTCCCCCACTAGACTGTAAGCTCCCTGAAGGCAAGAATCCTGTGCTTATGCTCAATATTAG 

CTCTCCCTTGGCACAGAGTAGGCACTCAACAAATGCTCCCCAAAAGGCTGAGTGGCTGACTGAA 

TTAAGTACCAGTGACATGCAGTAACTGCTAAGATAGATGAGCCATCTGTATGCTCTGACAGTTAC 

AGACTGAATAAGTTGGAGACTTCCCTAAAGGGTGGCATTTCCCCAGGGTAACAACGCAGAGCTC 

AGGTGTGGGAAGGTGCCAGGGGCAGGGGTGCAGAGGGGCTGAGGCTGAGGGGGGTGCAGAGG 

CTGGAGAAAGGATAACAGGAGAGAGTATACAGGCATGCCTTGATTTATTGCACTTCACAGGTAG 

CAGAATTTTTAAAGAAATTGAAGGTTTTGGGACATATATGTGACAGCAATAGGTTAAGAAAAGC 

AAAGCAGAGAAATTGAAGATTTGTGTCAACACTGCTTTAAGCAAATCTGTTGGCACCATTTTTCC 

AATAGCATGTGCCCATTTTGGGTCTCTACATTGCATTTTGGTAATTGCTTGCAATATTTCAAGCAT 

TTTCATTGTTATTATATGTGTTATAGTGATCTGTGATCAGTGATCTTTGATATATTATTGTAATTG 

TTTCGGGGCGCCATGAACCGCACCCATATAACACGGTAAACTTAATCAGCAAAAAAAAAAAAA 

AAAAAAAAACCCGGAAAAATTTTAGAATTGAAAAATATGAAAAACCCCCGGGGGGGTCTTTTCA 

GGGGGGGGCGGGGCCCCCAATTTAAATTTTTTTTTTTTTAACAAGG 

AAAAAATCCTCCTGAAAGATTAAATTTGGGGGCC 



The nucleic acid sequence of NOV1d has 414 of 421 bases (98%) identical to a 
gb:GENBANK-ID:AX071876|acc:AX071876.1 mRNA from Homo sapiens (Sequence 2348 from 
Patent WO0102568). 

The NOV1d polypeptide (SEQ ID NO:8) is 1037 amino acid residues in length and is 
presented using the one-letter amino acid code in Table 1J. The SignalP, Psort and/or Hydropathy 
results predict that NOV1d has a signal peptide and is likely to be localized to the plasma 
membrane with a certainty of 0.6760. In alternative embodiments, a NOV1d polypeptide is 
located to the outside of the cell with a certainty of 0.1000, the endoplasmic reticulum 
(membrane) with a certainty of 0.1000, or the endoplasmic reticulum (lumen) with a certainty of 
0.1000. The SignalP predicts a likely cleavage site for a NOV1d peptide between amino acid 
positions 20 and 21, i.e., at the dash in the sequence AGT-LN. 



Table 1J. Encoded NOV1d Protein Sequence (SEQ ID NO:8) 

MSPPLCPLLLLAVGLRLAGTLNPSDPNTCSFWESFTTTTKESHSRPFSLLPSEPCERPWEGPHTCPQPTV 

VYRTVYRQWKTDHRQRLQCCHGFYESRGFCVPLCAQECVHGRCVAPNQCQCVPGWRGDDCSSEC 

APGMWGPQCDKPCSCGNNSSCDPKSGVCSCPSGLQPPNCLQPCTPGYYGPACQFRCQCHGAPCDPQT 

GACFCPAERTGPSCDVSCSQGTSGFFCPSTHPCQNGGVFQTPQGSCSCPPGWMGTICSLPCPEGFHGPN 

CSQECRCHNGGLCDRFTGQCRCAPGYTGDRCREECPVGRFGQDCAETCDCAPDARCFPANGACLCEH 

GFTGDRCTDRLCPDGFYGLSCQAPRTCDREHSLSCHPMNGECSCLPGWAGLHCNESCPQDTHGPGCQ 

EHCLCLHGGVCQATSGLCQCAPGYTGPHCASLCPPDTYGVNCSARCSCENAIACSPIDGECVCKEGW 

QRGNCSVPCPPGTWGFSCNASCQCAHEAVCSPQTGACTCTPGWHGAHCQLPCPKGQFGEGCASRCD 

CDHSDGCDPVHGRCQCQAGWMGARCHLSCPEGLWGVNCSNTCTCKNGGTCLPENGNCVCAPGFRG 

PSCQRSCQPGRYGKRCVPCKCANHSFCHPSNGTCYCLAGWTGPDCSQPCPPGHWGENCAQTCQCHH 

GGTCHPQDGSCICPLGWTGHHCLEGCPLGTFGANCSQPCQCGPGEKCHPETGACVCPPGHSGAPCRIG 

IQEPFTVMPTTPVAYNSLGAVIGIAVLGSLVVALVALFIGYRHWQKDKEHHHLAVAYSSGRLDGSEYV 

MPDVPPSYSHYYSNPSYHTLSQCSPNPPPPNKVPGPLFASLQNPERPGGAQGHDNHTTLPADWKHRRE 

PPPGPLDRGSSRLDRSYSYSYSNGPGPFYNKGLISEEELWASVASLSSENPYATIRDLPSLPGGPRESSY 

MEMKGPPSGSPPRQPPQFWDSQRRRQPQPQRDSGTYEQPSPLIHDRDSVGSQPPLPPGLPPGHYDSPKN 

SHIPGHYDLPPVRHPPSPPLRRQDR 



NOV1e 

A disclosed NOV1e nucleic acid (designated CuraGen Acc. No. CG57012-04) includes the 
3114 nucleotide sequence (SEQ ID NO:9) shown in Table 1K. An open reading frame for the 
mature protein was identified beginning with an ATG codon at nucleotides 1-3 and ending with a 
TGA codon at nucleotides 3112-3114. The start and stop codons of the open reading frame are 
highlighted in bold type. 



Table 1K. NOV1e Nucleotide Sequence (SEQ ID NO:9) 

ATGTCACCGCCTCTGTGTCCCCTCCTTCTCCTGGCTGTGGGCCTGCGGCTGGCTGGAACTCTCAACCCCAGTG 

ATCCCAATACCTGCAGCTTCTGGGAAAGCTTCACTACCACCACCAAGGAGTCCCACTCCCGCCCCTTCAGCCT 

GCTCCCCTCAGAGCCCTGCGAGCGGCCCTGGGAGGGCCCCCATACTTGCCCCCAGCCCACGGTTGTATACCGG 

ACCGTGTACCGTCAGGTGGTGAAGACGGACCACCGCCAGCGCCTGCAGTGCTGCCATGGCTTCTATGAGAGCA 

GGGAGTTCTGTGTCGCGCTCTGTGCCCAGGAGTGTGTCCATGGCCGTTGTGTGGCACCCAATCAGTGCCAATG 

TGTGCCAGGCTGGCGGGGCGACGACTGTTCCAGTGAGTGTGCCCCAGGAATGTGGGGGCCACAGTGTGACAAG 

CCCTGCAGTTGCGGCAACAACAGCTCGTGTGATCCCAAGAGTGGGGTATGTTCTTGCCCTTCTGGTCTGCAGC 

CCCCGAACTGCCTTCAGCCCTGTACCCCTGGCTACTATGGCCCTGCCTGCCAGTTCCGCTGCCAGTGCCATGG 

GGCACCCTGCGATCCCCAGACTGGAGCCTGCTTCTGCCCCGCAGAGAGAACTGGGCCCAGCTGTGACGTGTCC 

TGTTCCCAGGGCACTTCTGGCTTCTTCTGCCCCAGCACCCATCCTTGCCAAAATGGAGGTGTCTTCCAAACCC 

CACAGGGCTCCTGCAGCTGCCCCCCTGGCTGGATGGGCACCATCTGCTCCCTGCCCTGCCCAGAGGGCTTTCA 

CGGACCCAACTGCTCCCAGGAATGTCGCTGCCACAACGGCGGCCTCTGTGACCGATTCACTGGGCAGTGCCGC 

TGCGCTCCGGGTTACACTGGGGATCGGTGCCGGGAGGAGTGCCCGGTGGGCCGCTTTGGGCAGGACTGTGCTG 

AGACGTGCGACTGCGCCCCGGACGCCCGTTGCTTCCCGGCCAAGGGCGCATGTCTGTGCGAACACGGCTTCAC 

TGGGGACCGCTGCACGGATCGCCTCTGCCCCGACGGCTTCTACGGTCTCAGCTGCCAGGCCCCCTGCACCTGC 

GACCGGGAGCACAGCCTCAGCTGCCACCCGATGAACGGGGAGTGCTCCTGCCTGCCGGGCTGGGCGGGCCTCC 

ACTGCAACGAGAGCTGCCCGCAGGACACGCATGGGCCAGGGTGCCAGGAGTACTGTCTCTGCCTGCACGGTGG 

CGTCTGCCAGGCTACCAGCGGCCTCTGTCAGTGCGCGCCGGGTTACACGGGCCCTCACTGTGCTAGTCTTTGT 

CCTCCTGACACCTACGGTGTCAACTGTTCTGCACGCTGCTCATGTGAAAATGCCATCGCCTGCTCACCCATCG 

ACGGCGAGTGCGTCTGCAAGGAAGGTTGGCAGCGTGGTAACTGCTCTGTGCCCTGCCCACCCGGAACCTGGGG 

CTTCAGTTGCAATGCCAGCTGCCAGTGTGCCCATGAGGCAGTCTGCAGCCCCCAAACTGGAGCCTGTACCTGC 

ACCCCTGGGTGGCATGGGGCCCACTGCCAGCTGCCCTGTCCGAAGGGGCAGTTTGGAGAAGGTTGTGCCAGTC 

GCTGTGACTGTGACCACTCTGATGGCTGTGACCCTGTTCATGGACGCTGTCAGTGCCAGGCTGGCTGGATGGG 

TGCCCGCTGCCACCTGTCCTGCCCTGAGGGCTTATGGGGAGTCAACTGTAGCAACACCTGCACCTGCAAGAAT 

GGGGGCACCTGTCTCCCTGAGAATGGCAACTGCGTGTGTGCACCCGGATTCCGGGGCCCCTCCTGCCAGAGAT 

CCTGTCAGCCTGGCCGCTATGGCAAACGCTGTGTGCCCTGCAAGTGCGCTAACCACTCCTTCTGCCACCCCTC 

GAACGGGACCTGCTACTGCCTGGCTGGCTGGACAGGCCCCGACTGCTCCCAGCCATGCCCTCCAGGACACTGG 

GGAGAAAACTGTGCCCAGACCTGCCAATGTCACCATGGTGGGACCTGCCATCCCCAGGATGGGAGCTGTATCT 

GCCCCCTAGGCTGGACTGGACACCACTGCTTAGAAGGCTGCCCTCTGGGGACATTTGGTGCTAACTGCTCCCA 

GCCATGCCAGTGTGGTCCTGGAGAAAAGTGCCACCCAGAGACTGGGGCCTGTGTATGTCCCCCAGGGCACAGT 

GGTGCACCTTGCAGGATTGGAATCCAGGAGCCCTTTACTGTGATGCCGACCACTCCAGTAGCGTATAACTCGC 

TGGGTGCAGTGATTGGCATTGCAGTGCTGGGGTCCCTTGTGGTAGCCCTGGTGGCACTGTTCATTGGCTATCG 

GCACTGGCAAAAAGGCAAGGAGCACCACCACCTGGCTGTGGCTTACAGCAGCGGGCGCCTGGACGGCTCCGAG 

TATGTCATGCCAGATGTCCCTCCGAGCTACAGTCACTACTACTCCAACCCCAGCTACCACACCCTGTCGCAGT 
GCTCCCCAAACCCCCCACCCCCTAACAAGGTTCCAGGCCCGCTCTTTGCCAGCCTGCAGAACCCTGAGCGGCC 
AGGTGGGGCCCAAGGGCATGATAACCACACCACCCTGCCTGCTGACTGGAAGCACCGCCGGGAGCCCCCTCCA 
GGGCCTCTGGACAGGGGGAGCAGCCACCTGGACCGAAGCTACAGCTATAGCTACAGCAATGGCCCAGGCCCAT 
TCTACGATAAAGGGCTCATCTCTGAAGAGGAGCTCGGGGCCAGTGTGACTTCCCTGAGCAGTGAGAACCCATA 
TGCCACCATCCGGGACCTGCCCAGCTTGCCAGGGGGCCCCCGGGAGAGCAGCTACATGGAGATGAAAGGCCCT 
CCCTCAGGATCTCCCCCCAGGCAGCCTCCTCAGTTCTGGGACAGCCAGAGGCGGCGGCAACCCCAGCCACAGA 
GAGACAGTGGCACCTACGAGCAGCCCAGCCCCCTGATCCATGACCGAGACTCTGTGGGCTCCCAGCCCCCTCT 
GCCTCCGGGCCTACCCCCCGGCCACTATGACTCACCCAAGAACAGCCACATCCCTGGACATTATGACTTGCCT 
CCAGTACGGCATCCCCCATCACCTCCACTTCGACGCCAGGACCGTTGA 



The NOV1e polypeptide (SEQ ID NO:10) is 1037 amino acid residues in length and is 
presented using the one-letter amino acid code in Table 1L. The SignalP, Psort and/or 
Hydropathy results predict that NOV1e has a signal peptide and is likely to be localized to the 
plasma membrane with a certainty of 0.6760. In alternative embodiments, a NOV1e polypeptide 
is located to the outside of the cell with a certainty of 0.1000, the endoplasmic reticulum 
(membrane) with a certainty of 0.1000, or the endoplasmic reticulum (lumen) with a certainty of 
0.1000. The SignalP predicts a likely cleavage site for a NOV1e peptide between amino acid 
positions 20 and 21, i.e., at the dash in the sequence AGT-LN. 



Table 1L. Encoded NOV1e Protein Sequence (SEQ ID NO:10) 

MSPPLCPLLLLAVGLRLAGTLNPSDPNTCSFWESFTTTTKESHSRPFSLLPSEPCERPWEGPHTCPQPTV 

VYRTVYRQWKTDHRQRLQCCHGFYESREFCVPLCAQECVHGRCVAPNQCQCVPGWRGDDCSSECA 

PGMWGPQCDKPCSCGNNSSCDPKSGVCSCPSGLQPPNCLQPCTPGYYGPACQFRCQCHGAPCDPQTG 

ACFCPAERTGPSCDVSCSQGTSGFFCPSTHPCQNGGVTQTPQGSCSCPPGWMGTICSLPCPEGFHGPNC 

SQECRCHNGGLCDRFTGQCRCAPGYTGDRCREECPVGRFGQDCAETCDCAPDARCFPANGACLCEH 

GFTGDRCTDRLCPDGFYGLSCQAPCTCDREHSLSCHPMNGECSCLPGWAGLHCNESCPQDTHGPGCQ 

EYCLCLHGGVCQATSGLCQCAPGYTGPHCASLCPPDTYGVNCSARCSCENAIACSPIDGECVCKEGW 

QRGNCSVPCPPGTWGFSCNASCQCAHEAVCSPQTGACTCTPGWHGAHCQLPCPKGQFGEGCASRCD 

CDHSDGCDPVHGRCQCQAGWMGARCHLSCPEGLWGVNCSNTCTCKNGGTCLPENGNCVCAPGFRG 

PSCQRSCQPGRYGKRCVPCKCANHSFCHPSNGTCYCLAGWTGPDCSQPCPPGHWGENCAQTCQCHH 

GGTCHPQDGSCICPLGWTGHHCLEGCPLGTFGANCSQPCQCGPGEKCHPETGACVCPPGHSGAPCRIG 

IQEPFTVMPTTPVAYNSLGAVIGIAVLGSLVVALVALFIGYRHWQKGKEHHHLAVAYSSGRLDGSEYV 

MPDVPPSYSHYYSNPSYHTLSQCSPNPPPPNKVPGPLFASLQNPERPGGAQGHDNHTTLPADWKHRRE 

PPPGPLDRGSSHLDRSYSYSYSNGPGPFYDKGLISEEELGASVTSLSSENPYATIRDLPSLPGGPRESSYM 

EMKGPPSGSPPRQPPQFWDSQRRRQPQPQRDSGTYEQPSPLIHDRDSVGSQPPLPPGLPPGHYDSPKNS 

HIPGHYDLPPVRHPPSPPLRRQDR 
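
The polypeptide lengths reported for NOV1b through NOV1e are consistent with the open reading frame coordinates given for each nucleic acid. The following minimal sketch shows the arithmetic; the function name is illustrative, and it assumes the reading frame runs uninterrupted from the first base of the start codon to the last base of the stop codon.

```python
def encoded_length(start_first_nt: int, stop_last_nt: int) -> int:
    """Number of residues encoded between a start codon and a stop codon.

    start_first_nt: 1-based position of the first base of the ATG codon.
    stop_last_nt:   1-based position of the last base of the stop codon.
    """
    span = stop_last_nt - start_first_nt + 1
    if span < 6 or span % 3 != 0:
        raise ValueError("coordinates do not describe a whole number of codons")
    return span // 3 - 1  # the stop codon does not encode a residue


# Coordinates as stated in the text for each nucleic acid.
orf_coordinates = {
    "NOV1b": (83, 2869),
    "NOV1c": (83, 2869),
    "NOV1d": (83, 3196),
    "NOV1e": (1, 3114),
}
for name, (start, stop) in orf_coordinates.items():
    print(name, encoded_length(start, stop))
```

The results, 928 residues for NOV1b and NOV1c and 1037 residues for NOV1d and NOV1e, agree with the lengths stated for the corresponding polypeptides.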



One or more consensus positions (Cons. Pos.) of the nucleotide sequence have been 
identified as SNPs as shown in Table 1M. "Depth" represents the number of clones covering the 
region of the SNP. The Putative Allele Frequency (Putative Allele Freq.) is the fraction of all the 
clones containing the SNP. A dash ("-"), when shown, means that a base is not present. The sign 
">" means "is changed to". 









Table 1M. SNPs of NOV1e 

Cons. Pos.   Depth   Change   Putative       Fragment Listing 
                              Allele Freq. 
2716         10      G > A    0.200          163608053(-,i,119650936) Fpos: 482 
                                             163610839(-,i,119650936) Fpos: 485 
2758         9       G > A    0.333          172614573(+,i,-1) Fpos: 132 
                                             172614575(+,i,-1) Fpos: 148 
                                             172614579(+,i,-1) Fpos: 146 


The NOV1 amino acid sequence has 834 of 1064 amino acid residues (78%) identical to, 
and 881 of 1064 amino acid residues (82%) similar to, the 1034 amino acid residue 
gi|17386053|gb|AAL38571.1|AF444274_1 (AF444274) Jedi protein [Mus musculus] (E = 0.0). 

NOV1b, NOV1c and NOV1d are expressed in at least the following tissues: adrenal gland, 
bone marrow, brain - amygdala, brain - cerebellum, brain - hippocampus, brain - substantia nigra, 
brain - thalamus, brain - whole, fetal brain, fetal kidney, fetal liver, fetal lung, heart, kidney, 
lymphoma - Raji, mammary gland, pancreas, pituitary gland, placenta, prostate, salivary gland, 
skeletal muscle, small intestine, spinal cord, spleen, stomach, testis, thyroid, trachea and uterus. 
NOV1e is expressed in at least the following tissues: adipose, heart, aorta, umbilical vein, 
pancreas, parathyroid gland, thyroid, stomach, liver, colon, bone marrow, peripheral blood, bone, 
cartilage, synovium/synovial membrane, brain, thalamus, cervix, placenta, amnion, vulva, testis, 
lung, kidney, skin, epidermis and dermis. Expression information was derived from the tissue 
sources of the sequences that were included in the derivation of each of the sequences of NOV1. 

NOV1a, NOV1b, NOV1c, NOV1d and NOV1e are very closely homologous, as is shown 
in the amino acid alignment in Table 1N. 



Table 1N. Amino Acid Alignment of NOV1a, NOV1b, NOV1c, NOV1d and NOV1e 



[Alignment of COR87920446_A, CG57012-01, CG57012-02, CG57012-03 and CG57012-04 over 
alignment positions 1 to 1060. The aligned residues are not legible in this copy; the few 
residues that survive mark differences among the variants in the 60-100, 410-450, 660-700, 
810-850, 910-950 and 960-1000 position blocks.] 



Homologies to any of the above NOV1 proteins will be shared by the other NOV1 
proteins insofar as they are homologous to each other as shown above. Any reference to NOV1 is 
assumed to refer to all of the NOV1 proteins in general, unless otherwise noted. 

NOV1 also has homology to the amino acid sequences shown in the BLASTP data listed 
in Table 1O. 



Table 1O. BLAST results for NOV1 

Gene Index/Identifier                             Protein/Organism                Length   Identity         Positives        Expect 
                                                                                  (aa)     (%)              (%) 
gi|17386053|gb|AAL38571.1|AF444274_1 (AF444274)   Jedi protein [Mus musculus]     1034     834/1064 (78%)   881/1064 (82%)   0.0 
gi|17017251|gb|AAL33583.1|AF440279_1 (AF440279)   MEGF12 [Mus musculus]           1034     836/1064 (78%)   882/1064 (82%)   0.0 
gi|14192943|ref|NP_115822.1| (NM_032446)          MEGF10 protein [Homo sapiens]   1140     349/713 (48%)    422/713 (58%)    e-163 
gi|14724016|ref|XP_030163.1| (XM_030163)          MEGF10 protein [Homo sapiens]   1140     349/713 (48%)    422/713 (58%)    e-163 
gi|14017777|dbj|BAB47409.1| (AB058676)            MEGF10 protein [Homo sapiens]   1140     349/713 (48%)    422/713 (58%)    e-163 



The homology of these sequences is shown graphically in the ClustalW analysis shown in 
Table 1P. 

Table 1P. ClustalW Analysis of NOV1 

1) NOV1a (SEQ ID NO: 2) 
2) NOV1b (SEQ ID NO: 4) 
3) NOV1c (SEQ ID NO: 6) 
4) NOV1e (SEQ ID NO: 10) 
5) NOV1d (SEQ ID NO: 8) 
6) gi|17386053|gb|AAL38571.1|AF444274_1 (AF444274) Jedi protein [Mus musculus] (SEQ ID NO: 31) 
7) gi|17017251|gb|AAL33583.1|AF440279_1 (AF440279) MEGF12 [Mus musculus] (SEQ ID NO: 32) 
8) gi|14192943|ref|NP_115822.1| (NM_032446) MEGF10 protein [Homo sapiens] (SEQ ID NO: 33) 
9) gi|14192941|ref|NP_115821.1| (NM_032445) MEGF11 protein [Homo sapiens] (SEQ ID NO: 34) 
10) gi|16161114|ref|XP_050906.2| (XM_050906) MEGF11 protein [Homo sapiens] (SEQ ID NO: 35) 

[ClustalW alignment of sequences 1-10 not reproduced: only the row labels (NOV1a COR87920446_A,
NOV1b CG57012-01, NOV1c CG57012-02, NOV1e CG57012-04, NOV1d CG57012-03 and the five gi entries)
survived extraction.]

A sequence of about thirty to forty amino acid residues found in the sequence of 
epidermal growth factor (EGF) has been shown to be present, in a more or less conserved form, in 
a large number of other, mostly animal proteins. The list of proteins currently known to contain 
one or more copies of an EGF-like pattern is large and varied. The functional significance of EGF 
domains in what appear to be unrelated proteins is not yet clear. However, a common feature is 
that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins 
known to be secreted (exception: prostaglandin G/H synthase). The EGF domain includes six 
cysteine residues which have been shown (in EGF) to be involved in disulfide bonds. The main 
structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet. 
Subdomains between the conserved cysteines vary in length. 
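
The cysteine count and approximate length described above can be turned into a crude screening heuristic. The sketch below is only an illustration under stated assumptions: the function name, the 40-residue window and the toy sequence are hypothetical, and the test is far weaker than a real domain search (it ignores cysteine spacing and disulfide pairing), so it should not be read as the method used to identify EGF-like domains in the disclosed proteins.

```python
def cysteine_rich_windows(protein: str, window: int = 40, min_cys: int = 6):
    """Return 1-based (start, end) spans that begin and end on a cysteine,
    are at most `window` residues long, and contain `min_cys` cysteines."""
    # 0-based positions of every cysteine in the sequence
    cys = [i for i, aa in enumerate(protein.upper()) if aa == "C"]
    spans = []
    # Slide over runs of `min_cys` consecutive cysteines
    for k in range(len(cys) - min_cys + 1):
        first, last = cys[k], cys[k + min_cys - 1]
        if last - first + 1 <= window:
            spans.append((first + 1, last + 1))
    return spans


# Hypothetical toy sequence with six evenly spaced cysteines
toy = "AAC" + "AAAC" * 5 + "AA"
print(cysteine_rich_windows(toy))
```

Any span reported this way would still need to be checked against a curated domain model before being called EGF-like.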

Ligands of the Delta/Serrate/lag-2 (DSL) family and their receptors, members of the 
lin-12/Notch family, mediate cell-cell interactions that specify cell fate in invertebrates and 
vertebrates. In C. elegans, two DSL genes, lag-2 and apx-1, influence different cell fate decisions 
during development. Molecular interaction between Notch and Serrate, another EGF-homologous 
transmembrane protein containing a region of striking similarity to Delta, has been shown and the 
same two EGF repeats of Notch may also constitute a Serrate binding domain. 

The Notch signaling pathway is a conserved intercellular signaling mechanism that is 
essential for proper embryonic development in numerous metazoan organisms. Members of the 
Notch gene family encode transmembrane receptors that are critical for various cell fate 
decisions. Multiple ligands that activate Notch and related receptors have been identified, 
including Serrate and Delta in Drosophila and JAG1 in vertebrates. By searching for human brain 
expressed sequence tags (ESTs) homologous to Serrate and Delta, Luo et al. (1997) (Molec. Cell. 
Biol. 17: 6057-6067) identified a cDNA which they called Jagged-2 (JAG2). The predicted 
1,238-amino acid JAG2 protein has several recognizable motifs, including a signal peptide, 16 EGF-like 
repeats, a transmembrane domain, and a short cytoplasmic domain. The amino acid sequence of 
human JAG2 is 89% identical to that of rat Jag2. Northern blot analysis and in situ hybridization 
showed expression of Jag2 in various murine tissues. Immunohistochemistry revealed 
coexpression of Jag2 and Notch1 within murine fetal thymus and other murine fetal tissues. 
Coculture of fibroblasts expressing human JAG2 with murine C2C12 myoblasts inhibited 
myogenic differentiation. This effect was simulated by expression of constitutively active Notch1, 
suggesting that JAG2 engages the Notch1 pathway of signal transduction. 

Jiang et al. (1998) (Genes Dev. 12: 1046-1057) examined the in vivo role of the Jag2 gene 
by making a targeted mutation that removed a domain of the Jagged-2 protein required for 
receptor interaction. Mice homozygous for this deletion died perinatally because of defects in 
craniofacial morphogenesis. The mutant homozygotes exhibited cleft palate and fusion of the 
tongue with the palatal shelves. They also exhibited syndactyly of the fore- and hindlimbs. The 
apical ectodermal ridge (AER) of the limb buds of the mutant homozygotes was hyperplastic, and 
Jiang et al. (1998) (Genes Dev. 12: 1046-1057) observed an expanded domain of Fgf8 expression 
in the AER. In the foot plates of the mutant homozygotes, both Bmp2 and Bmp7 expression and 
apoptotic interdigital cell death were reduced. Mutant homozygotes also displayed defects in 
thymic development, exhibiting altered thymic morphology and impaired differentiation of T cells 
of the gamma/delta lineage. These results demonstrated that Notch signaling mediated by Jag2 
plays an essential role during limb, craniofacial, and thymic development in mice. 

Lanford et al. (1999) (Nature Genet. 21: 289-292) showed that the genes encoding the 
receptor protein Notch1 and its ligand, Jag2, are expressed in alternating cell types in the 
developing sensory epithelium of the mammalian cochlea (the organ of Corti). The sensory 
epithelium contains 4 rows of mechanosensory hair cells: a single row of inner hair cells and 3 
rows of outer hair cells. Each hair cell is separated from the next by an interceding supporting 
cell, forming an invariant and alternating mosaic that extends the length of the cochlear duct. 
Previous results had suggested that determination of cell fates in the cochlear mosaic occurs via 
inhibitory interactions between adjacent progenitor cells. Cells populating the cochlear epithelium 
appear to constitute a developmental equivalence group in which developing hair cells suppress 
differentiation in their immediate neighbors through lateral inhibition. Lanford et al. (1999) 
(Nature Genet. 21: 289-292) also found that genetic deletion of Jag2 results in a significant 
increase in sensory hair cells, presumably as the result of a decrease in Notch activation. These 
results provided direct evidence for Notch-mediated lateral inhibition in a mammalian system and 
supported a role for Notch in the development of the cochlear mosaic. 

The protein similarity information, expression pattern, and map location for the NOV1 
proteins and nucleic acids disclosed herein suggest that they may have important structural and/or 
physiological functions characteristic of the family. Therefore, the nucleic acids and proteins of 
the invention are useful in potential diagnostic and therapeutic applications and as a research tool. 
These include serving as a specific or selective nucleic acid or protein diagnostic and/or 
prognostic marker, wherein the presence or amount of the nucleic acid or the protein is to be 
assessed, as well as potential therapeutic applications such as the following: (i) a protein 
therapeutic, (ii) a small molecule drug target, (iii) an antibody target (therapeutic, diagnostic, drug 
targeting/cytotoxic antibody), (iv) a nucleic acid useful in gene therapy (gene delivery/gene 
ablation), (v) a composition promoting tissue regeneration in vitro and in vivo, and (vi) a biological 
defense weapon. 

The nucleic acids and proteins of the invention are useful in potential diagnostic and 
therapeutic applications implicated in various diseases and disorders described below and/or other 
pathologies. For example, the compositions of the present invention will have efficacy for 
treatment of patients suffering from: cardiovascular disease, Alagille syndrome, neural 
development defects, other developmental defects and other diseases, disorders and conditions of 
the like. 

NOV2 

A disclosed NOV2 nucleic acid (designated as CuraGen Acc. No. COR87940554), which 
encodes a novel secretin receptor precursor-like protein, includes the 1833 nucleotide sequence 
(SEQ ID NO: 11) shown in Table 2A. An open reading frame for the mature protein was 
identified beginning with an ATG codon at nucleotides 74-76 and ending with a TGA codon at 
nucleotides 1745-1747. Putative untranslated regions are underlined in Table 2A, and the start 
and stop codons are in bold letters. 
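
The open reading frame coordinates quoted above can be cross-checked against the stated protein length. The following is a minimal sketch, assuming 1-based nucleotide numbering and the standard stop codons; the function names and the toy sequence are hypothetical and are not taken from Table 2A. It walks codons from a given ATG to the first in-frame stop, and separately computes where the stop codon should begin for an ORF of a given number of residues.

```python
# Stop codons of the standard genetic code
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orf(seq: str, start: int):
    """Walk codons from a 1-based ATG position to the first in-frame stop.

    Returns (stop_start, stop_end, n_codons) with 1-based positions, where
    n_codons counts sense codons (start codon included, stop excluded).
    """
    seq = seq.upper()
    if seq[start - 1:start + 2] != "ATG":
        raise ValueError("no ATG at the given start position")
    pos = start - 1  # 0-based index of the current codon
    n_codons = 0
    while pos + 3 <= len(seq):
        if seq[pos:pos + 3] in STOP_CODONS:
            return pos + 1, pos + 3, n_codons
        n_codons += 1
        pos += 3
    raise ValueError("no in-frame stop codon found")


def expected_stop_start(start: int, n_residues: int) -> int:
    """1-based position where the stop codon begins for an ORF of n_residues."""
    return start + 3 * n_residues


# Hypothetical toy sequence: 2 nt leader, ATG, two more codons, TGA, 2 nt trailer
print(find_orf("CCATGGCATTTTGAGG", 3))
# Coordinates from the text: ATG at 74-76 and a 557-residue polypeptide
print(expected_stop_start(74, 557))
```

With a start at nucleotide 74 and 557 encoded residues, the stop codon is expected to begin at nucleotide 1745, which agrees with the TGA position given in the text.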



Table 2A. NOV2 Nucleotide Sequence (SEQ ID NO: 11) 

AGCGAGTCCQTCTGTCAGGCCGCCTCCTCTCCGGCCGTCTGATTTTCTAGCCTTCGGCGCCCTGCTCTTCCTCAT 
GTTGGCATCCCCGGCCACGGAGACCACCGTCCTCATGTCCCAGACTGAGGCCGACCTGGCCCTGCGGCCCCCGCC 
TCCTCTTGGCACCGCGGGGCAGCCCCGCCTCGGGCCCCCTCCTCGCCGAGCGCGCCGCTTCTCCGGGAAGGCTGA 
GCCCCGGCCGCGCTCTTCGAGACCTAGCCGCCGCAGCTCAGTCGATCTGGGACTGCTGAGCTCCTGGTCTCAACC 
AGCCTCACTCCTTCCGGAACCCCCGGATCCTCCAGACTCCGCTGGCCCCACGAGGAGCCCACCTTCAAGCTCTAA 
AGAACCCCCCGAGGGCACATGGATGGGGGCAGCTCCCGTGAAGGCTGTGGACTCTGCATGTCCTGAGCTTACGGG 
ATCTTCAGGGGGCCCGGGGTCCAGGGAGCCGCTAAGGGTCCCTGAAGCTGTGGCCCTAGAGCGGCGGCGGGAGCA 
GGAAGAAAAGGAGGACATGGAGACCCAGGCTGTGGCAACGTCCCCCGATGGCCGATACCTCAAGTTTGACATCGA 
GATTGGACGTGGCTCCTTCAAGACGGTGTATCGAGGGCTAGACACCGACACCACAGTGGAGGTGGCCTGGTGTGA 
GCTGCAGACTCGGAAACTGTCTAGAGCTGAGCGGCAGCGCTTCTCAGAGGAGGTGGAGATGCTCAAGGGGCTGCA 
GCACCCCAACATCGTCCGCTTCTATGATTCGTGGAAGTCGGTGCTGAGGGGCCAGGTTTGCATCGTGCTGGTCAC 
CGAACTCATGACCTCGGGCACGCTCAAGACGTACCTGAGGCGGTTCCGGGAGATGAAGCCGCGGGTCCTTCAGCG 
CTGGAGCCGCCAAATCCTGCGGGGACTTCATTTCCTACACTCCCGGGTTCCTCCCATCCTGCACCGGGATCTCAA 
GTGCGACAATGTCTTTATCACGGGACCTACTGGCTCTGTCAAAATCGGGGACCTGGGCCTGGCCACGCTCAAGCG 
CGCCTCCTTTGCCAAGAGTGTCATCGGGACCCCGGAATTCATGGCCCCCGAGATGTACGAGGAAAAGTACGATGA 
GGCCGTGGACGTGTACGCGTTCGGCATGTGCATGCTGGAGATGGCCACCTCTGAGTACCCGTACTCCGAGTGCCA 
GAATGCCGCGCAAATCTACCGCAAGGTCACTTCGGGCAGAAAGCCGAACAGCTTCCACAAGGTGAAGATACCCGA 
GGTGAAGGAGATCATTGAAGGCTGCATCCGCACGGATAAGAACGAGAGGTTCACCATCCAGGACCTCCTGGCCCA 
CGCCTTCTTCCGCGAGGAGCGCGGTGTGCACGTGGAACTAGCGGAGGAGGACGACGGCGAGAAGCCGGGCCTCAA 
GCTCTGGCTGCGCATGGAGGACGCGCGGCGCGGGGGGCGCCCACGGGACAACCAGGCCATCGAGTTCCTGTTCCA 
GCTGGGCCGGGACGCGGCCGAGGAGGTGGCACAGGAGATGGTGGCTCTGGGCTTGGTCTGTGAAGCCGATTACCA 
GCCAGTGGCCCGTGCAGTACGTGAACGGGTTGCTGCCATCCAGCGAAAGCGTGAGAAGCTGCGTAAAGCAAGGGA 
ATTGGAGGCACTCCCACCAGAGCCAGGACCTCCACCAGCAACTGTGCCCATGGACCCCGGTCCACCAACAGATGT 
CTATCCACCCCATGAGACCTGAGGAGCAAGAGGCAAGACCAGAACACAGCACCTTCCTTATTACAGACACGCCAA 
GCTACTCATCTACCACTTCGGATTGCGAGACTG 



The nucleic acid sequence of NOV2 maps to chromosome 17 and has 1025 of 1464 bases 
(70%) identical to a gb:GENBANK-ID:AB044546|acc:AB044546.1 mRNA from Homo sapiens 
(Homo sapiens P/OKcl.13 mRNA for mitogen-activated protein kinase kinase kinase, partial cds). 

The NOV2 polypeptide (SEQ ID NO: 12) is 557 amino acid residues in length and is 
presented using the one-letter amino acid code in Table 2B. The SignalP, Psort and/or 
Hydropathy results predict that NOV2 is likely to be localized in the nucleus with a certainty of 
0.6000. In alternative embodiments, a NOV2 polypeptide is located in the mitochondrial matrix 
space with a certainty of 0.3600 or the lysosome (lumen) with a certainty of 0.1000. 



Table 2B. Encoded NOV2 Protein Sequence (SEQ ID NO:12) 

MLASPATETTVLMSQTEADLALRPPPPLGTAGQPRLGPPPRRARRFSGKAEPRPRSSRPSRRSSVDLGLLSSWS 
QPASLLPEPPDPPDSAGPTRSPPSSSKEPPEGTWMGAAPVKAVDSACPELTGSSGGPGSREPLRVPEAVALERR 
REQEEKEDMETQAVATSPDGRYLKFDIEIGRGSFKTVYRGLDTDTTVEVAWCELQTRKLSRAERQRFSEEVEML 
KGLQHPNIVRFYDSWKSVLRGQVCIVLVTELMTSGTLKTYLRRFREMKPRVLQRWSRQILRGLHFLHSRVPPIL 
HRDLKCDNVFITGPTGSVKIGDLGLATLKRASFAKSVIGTPEFMAPEMYEEKYDEAVDVYAFGMCMLEMATSEY 
PYSECQNAAQIYRKVTSGRKPNSFHKVKIPEVKEIIEGCIRTDKNERFTIQDLLAHAFFREERGVHVELAEEDD 
GEKPGLKLWLRMEDARRGGRPRDNQAIEFLFQLGRDAAEEVAQEMVALGLVCEADYQPVARAVRERVAAIQRKR 
EKLRKARELEALPPEPGPPPATVPMDPGPPTDVYPPHET . 



The NOV2 amino acid sequence has 521 of 552 amino acid residues (94%) identical to, and 
524 of 552 amino acid residues (94%) similar to, the 1243 amino acid residue 
gi|15212448|gb|AAK91995.1|AF390018_1 (AF390018) protein from Homo sapiens (PUTATIVE 
PROTEIN KINASE WNK4) (E = 0.0). 
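
The percentages quoted for these comparisons follow from the match counts. The helper below is a small sketch that assumes the percentage is truncated to a whole number rather than rounded; that is an assumption, chosen because it reproduces the figures quoted here (521/552 and 524/552 both give 94%, and 1025/1464 gives 70%). The function name is hypothetical.

```python
def percent_truncated(matches: int, aligned: int) -> int:
    """Percentage of matching positions, truncated to a whole number."""
    if aligned <= 0:
        raise ValueError("aligned length must be positive")
    # Integer arithmetic avoids floating-point rounding surprises
    return (100 * matches) // aligned


# Counts taken from the text above
for matches, aligned in [(521, 552), (524, 552), (1025, 1464)]:
    print(matches, aligned, percent_truncated(matches, aligned))
```

Under rounding instead of truncation, 524/552 would read 95%, so the choice matters when comparing values near a half-percent boundary.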

NOV2 is expressed in at least the following: blood, lymphocyte, breast, tonsil, colon, 
lymph, stomach, adrenal gland, kidney, testis, lung. 

NOV2 also has homology to the amino acid sequences shown in the BLASTP data listed 
in Table 2C. 



Table 2C. BLAST results for NOV2 

Gene Index/Identifier                            Protein/Organism                                                             Length (aa)  Identity (%)   Positives (%)  Expect
gi|15212448|gb|AAK91995.1|AF390018_1 (AF390018)  putative protein kinase WNK4 [Homo sapiens]                                  1243         521/552 (94%)  524/552 (94%)  0.0
gi|15277312|ref|NP_115763.1| (NM_032387)         putative protein kinase WNK4 [Homo sapiens]                                  1231         509/540 (94%)  512/540 (94%)  0.0
gi|15131540|emb|CAC48387.1| (AJ316534)           serine/threonine protein kinase [Homo sapiens]                               1231         509/540 (94%)  512/540 (94%)  0.0
gi|6933864|gb|AAF31483.1| (AF061944)             kinase deficient protein KDP [Homo sapiens]                                  670          309/479 (64%)  372/479 (77%)  e-159
gi|16758634|ref|NP_446246.1| (NM_053794)         protein kinase, lysine deficient 1 [Rattus norvegicus]                       2126         304/476 (63%)  363/476 (75%)  e-153
gi|8272557|gb|AAF74258.1|AF227741_1 (AF227741)   protein kinase WNK1 [Rattus norvegicus]                                      2126         304/476 (63%)  363/476 (75%)  e-153
gi|12711660|ref|NP_061852.1| (NM_018979)         protein kinase, lysine deficient 1; kinase deficient protein [Homo sapiens]  2382         309/479 (64%)  372/479 (77%)  e-153
gi|11125348|emb|CAC15059.1| (AJ296290)           putative protein kinase [Homo sapiens]                                       2382         309/479 (64%)  372/479 (77%)  e-153

The homology of these sequences is shown graphically in the ClustalW analysis shown in 
Table 2D. 

Table 2D. ClustalW Analysis for NOV2 

1) NOV2 (SEQ ID NO: 12) 
2) gi|15212448|gb|AAK91995.1|AF390018_1 (AF390018) putative protein kinase WNK4 [Homo sapiens] (SEQ ID NO: 36) 
3) gi|15277312|ref|NP_115763.1| (NM_032387) putative protein kinase WNK4 [Homo sapiens] (SEQ ID NO: 37) 
4) gi|6933864|gb|AAF31483.1| (AF061944) kinase deficient protein KDP [Homo sapiens] (SEQ ID NO: 38) 
5) gi|16758634|ref|NP_446246.1| (NM_053794) protein kinase, lysine deficient 1 [Rattus norvegicus] (SEQ ID NO: 39) 
6) gi|12711660|ref|NP_061852.1| (NM_018979) protein kinase, lysine deficient 1; kinase deficient protein [Homo sapiens] (SEQ ID NO: 40) 



[ClustalW alignment of sequences 1-6 (NOV2 COR87940554, gi|15212448|, gi|15277312|, gi|6933864|,
gi|16758634|, gi|12711660|), alignment positions 1-2390, not reproduced: the residue columns did
not survive extraction.]

Tables 2E, 2F and 2G list the domain description from DOMAIN analysis results against 
NOV2. This indicates that the NOV2 sequence has properties similar to those of other proteins 
known to contain these domains. 
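
The "% aligned" figure reported with each domain hit can be related to the subject coordinates of the alignment. The sketch below assumes that the figure is the share of domain-model positions spanned by the subject range; that reading is an assumption, and the function name is hypothetical. For the serine/threonine kinase and protein kinase hits, the subject coordinates run from 6 to 256 over a CD-Length of 256 residues.

```python
def percent_of_domain_aligned(sbjct_start: int, sbjct_end: int, cd_length: int) -> float:
    """Share of the domain model covered by the subject range, in percent."""
    if cd_length <= 0 or sbjct_end < sbjct_start:
        raise ValueError("invalid coordinates")
    aligned = sbjct_end - sbjct_start + 1  # inclusive 1-based range
    return 100.0 * aligned / cd_length


# Subject coordinates 6..256 over a 256-residue domain model
print(round(percent_of_domain_aligned(6, 256, 256), 1))
```

Under that assumption, 251 of 256 model positions are covered, which is consistent with the 98.0% reported for both hits.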



Table 2E. Domain Analysis of NOV2 

gnl|Smart|smart00220, S_TKc, Serine/Threonine protein kinases, catalytic domain;
Phosphotransferases. Serine or threonine-specific kinase subfamily.

(SEQ ID NO:41)

CD-Length = 256 residues, 98.0% aligned
Score = 221 bits (562), Expect = 1e-58

NOV 2:  176 EIGRGSFKTVYRGLDTDTTVEVAWCELQTRKLSRAERQRFSESVEMLKGLQHPNIVRFYD 235
            + I + M II II II ++ II + +1 + 1 I+++II I Mill- II
Sbjct:    6 VLGKGAFGKVYLARDKKTGKLVAIKVIKKEKLKKKKRERILREIKILKKLDHPNIVKLYD 65

NOV 2:  236 SWKSVLRGQVCIVLVTELMTSGTLKTYLRRFREMKPRVLQRWSRQILRGLHFLHSRVPPI 295
            + HI || | ++ + + ++| I I I I +1 I 1+ I
Sbjct:   66 VFED DDKLYLVMEYCEGGDLFDLLKKRGRLSEDEARFYARQILSALEYLHSQ--GI 119

NOV 2:  296 LHRDLKCDNVFITGPTGSVKIGDLGLATL--KRASFAKSVIGTPEFMAPEMY-EEKYDEA 352
            + IIMI +1+ + I 11+ I Ml + + + 1 1 M + M M + +1+1
Sbjct:  120 IHRDLKPENILLD-SDGHVKLADFGLAKQLDSGGTLLTTFVGTPEYMAPEVLLGKGYGKA 178

NOV 2:  353 VDVYAFGMCMLEMATSEYPYSECQNAAQIYRKVTSGRKPNSFHKVKI-PEVKEIIEGCIR 411
            u +++ |+ + |+ | + |+ i + mi i++i+ +
Sbjct:  179 VDIWSLGVILYELLTGKPPFPGDDQLLALFKKIGKPPPPFPPPEWKISPEAKDLIKKLLV 238

NOV 2:  412 TDKNERFTIQDLLAHAFF 429
            I +1 I ++ I ! M
Sbjct:  239 KDPEKRLTAEEALEHPFF 256
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Table 2F. Domain Analysis of NOV2 




gnl|Pfam|pfam00069, pkinase, Protein kinase domain.

(SEQ ID NO:42)

CD-Length = 256 residues, 98.0% aligned
Score = 197 bits (500), Expect = 2e-51

NOV 2:  176 EIGRGSFKTVYRGLDTDTTVEVAWCELQTRKLSRAERQRFSEEVEMLKGLQHPNIVRFYD 235
            ++ | M IN 1! II 1+ 1 II +++II I+++I+ I HUM
Sbjct:    6 KLGSGAFGKVYKGKHKDTGEIVAIKILKKRSLS-EKKKRFLREIQILRRLSHPNIVRLLG 64

NOV 2:  236 SWKSVLRGQVCIVLVTELMTSGTLKTYLRR-FREMKPRVLQRWSRQILRGLHFLHSRVPP 294
            + ii i i ii mi + + ++ + 1 1 1 1 ii +i 1 1 1
Sbjct:   65 VFEE DDHLYLVMEYMEGGDLFDYLRRNGLLLSEKEAKKIALQILRGLEYLHSRG-- 118

NOV 2:  295 ILHRDLKCDNVFITGPTGSVKIGDLGLATLKRA---SFAKSVIGTPEFMAPE-MYEEKYD 350
            MINI +1+ + l + lll Mil + + +IIII+IIII + 1
Sbjct:  119 IVHRDLKPENILLDEN-GTVKIADFGLARKLESSSYEKLTTFVGTPEYMAPEVLEGRGYS 177

NOV 2:  351 EAVDVYAFGMCMLEMATSEYPYSECQNAAQIYRKVTSGRKPNSFHKVKIPEVKEIIEGCI 410
            1 1 1++ 1+ + 1+ 1 + 1+ 1 I+I++I+ 1 +
Sbjct:  178 SKVDVWSLGVILYELLTGKLPFPGIDPLEELFRIKERPRLRLPLPPNCSEELKDLIKKCL 237

NOV 2:  411 RTDKNERFTIQDLLAHAFF 429
            1 +| I +++| 1 +|
Sbjct:  238 NKDPEKRPTAKEILNHPWF 256
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Table 2G. Domain Analysis of NOV2 




gnl|Smart|smart00219, TyrKc, Tyrosine kinase, catalytic domain; Phosphotransferases.
Tyrosine-specific kinase subfamily.

(SEQ ID NO:43)

CD-Length = 258 residues, 98.4% aligned
Score = 143 bits (361), Expect = 2e-35

NOV 2:  171 LKFDIEIGRGSFKTVYRGL---DTDTTVEVAWCELQTRKLSRAERQRFSEEVEMLKGLQH 227
            I ++i i + l M + l MM 1 + 1 + + i 1 +++ I I
Sbjct:    1 LTLGKKLGEGAFGEVYKGTLKGKGGVEVEVAVKTLKEDA-SEQQIEEFLREARLMRKLDH 59

NOV 2:  228 PNIVRFYDSWKSVLRGQVCIVLVTELMTSGTLKTYLR--RFREMKPRVLQRWSRQILRGL 285
            MH+ i + 1 1 M Mi 1 +1+ 1 ++ M 11 +
Sbjct:   60 PNIVKLL GVCTEEEPLMIVMEYMEGGDLLDYLRKNRPKELSLSDLLSFALQIARGM 115

NOV 2:  286 HFLHSRVPPILHRDLKCDNVFITGPTGSVKIGDLGLATLKRASFAKSVIGTPEFMA 341
            + 1 1+ + M M 1 + +111 1 Ml +1 +M
Sbjct:  116 EYLESK--NFVHRDLAARNCLVGEN-KTVKIADFGLARDLYDDDYYRKKKSPRLPIRWMA 172

NOV 2:  342 PEMYEE-KYDEAVDVYAFGMCMLEMAT-SEYPYSECQNAAQIYRKVTSG---RKPNSFHK 396
            M ++ 1+ M++M+ + 1+ 1 Mi 1 ++ + i + 1 +
Sbjct:  173 PESLKDGKFTSKSDVWSFGVLLWEIFTLGESPYPGMSN-EEVLEYLKKGYRLPQPPNCP- 230

NOV 2:  397 VKIPEVKEIIEGCIRTDKNERFTIQDL 423
            1+ +++ I 1 +1 1 +1
Sbjct:  231 ---DEIYDLMLQCWAEDPEDRPTFSEL 254





The protein similarity information, expression pattern, and map location for the NOV2
protein and nucleic acid disclosed herein suggest that it may have important structural and/or
physiological functions characteristic of the family. Therefore, the nucleic acids and proteins of
the invention are useful in potential diagnostic and therapeutic applications and as a research tool.
These include serving as a specific or selective nucleic acid or protein diagnostic and/or
prognostic marker, wherein the presence or amount of the nucleic acid or the protein is to be
assessed, as well as potential therapeutic applications such as the following: (i) a protein
therapeutic, (ii) a small molecule drug target, (iii) an antibody target (therapeutic, diagnostic, drug
targeting/cytotoxic antibody), (iv) a nucleic acid useful in gene therapy (gene delivery/gene
ablation), (v) a composition promoting tissue regeneration in vitro and in vivo, and (vi) a biological
defense weapon.
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The nucleic acids and proteins of the invention are useful in potential diagnostic and 
therapeutic applications implicated in various diseases and disorders described below and/or other 
pathologies. For example, the compositions of the present invention will have efficacy for 
treatment of patients suffering from: Hypercalcemia, Ulcers, Hemophilia, hypercoagulation, 
idiopathic thrombocytopenic purpura, autoimmune disease, allergies, immunodeficiencies,
transplantation, Graft versus host disease (GVHD), Lymphaedema, Systemic lupus
erythematosus, Autoimmune disease, Asthma, Emphysema, Scleroderma, allergy, Diabetes,
Autoimmune disease, Renal artery stenosis, Interstitial nephritis, Glomerulonephritis, Polycystic
kidney disease, Systemic lupus erythematosus, Renal tubular acidosis, IgA nephropathy,
Cardiovascular disease, Hypercalcemia, Lesch-Nyhan syndrome, Fertility, Cancer and other
diseases, disorders and conditions of the like.

Protein phosphorylation is a fundamental process for the regulation of cellular functions.
The coordinated action of both protein kinases and phosphatases controls the levels of
phosphorylation and, hence, the activity of specific target proteins. One of the predominant roles
of protein phosphorylation is in signal transduction, where extracellular signals are amplified and
propagated by a cascade of protein phosphorylation and dephosphorylation events. Eukaryotic
protein kinases are enzymes that belong to a very extensive family of proteins which share a
conserved catalytic core common with both serine/threonine and tyrosine protein kinases. There
are a number of conserved regions in the catalytic domain of protein kinases. In the N-terminal
extremity of the catalytic domain there is a glycine-rich stretch of residues in the vicinity of a
lysine residue, which has been shown to be involved in ATP binding. In the central part of the
catalytic domain there is a conserved aspartic acid residue which is important for the catalytic
activity of the enzyme.
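The two conserved features described above can be looked for in the NOV2 residues printed in the domain alignments of Tables 2E to 2G. The following is a minimal sketch only: both regular expressions are deliberately loose, illustrative patterns chosen for this example (they are not PROSITE signatures), and the function name is hypothetical.

```python
import re

# Illustrative, loose patterns (assumptions for this sketch, not PROSITE signatures):
# a G-x-G stretch followed within a few residues by a lysine, and an aspartate
# in an H-R-D context.
GLY_RICH_NEAR_LYS = re.compile(r"G.G.{1,6}K")
CATALYTIC_ASP = re.compile(r"HRD")


def find_motifs(sequence, start=1):
    """Return (name, position, match) tuples; position is the residue number
    of the first matched residue, given the residue number of sequence[0]."""
    hits = []
    for name, pattern in (("gly_rich_near_lys", GLY_RICH_NEAR_LYS),
                          ("catalytic_asp", CATALYTIC_ASP)):
        for m in pattern.finditer(sequence):
            hits.append((name, start + m.start(), m.group()))
    return hits


# NOV2 residues 176-207 and 296-320 as printed in the domain alignments.
n_terminal = "EIGRGSFKTVYRGLDTDTTVEVAWCELQTRKL"
central = "LHRDLKCDNVFITGPTGSVKIGDLG"

print(find_motifs(n_terminal, start=176))
print(find_motifs(central, start=296))
```

A hit from patterns this loose is only a pointer to where the conserved regions may lie; it is not evidence of kinase activity.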

NOV3 

A disclosed NOV3 nucleic acid (designated as CuraGen Acc. No. COR100339661)
encodes a novel GPCR-like protein and includes the 2646 nucleotide sequence (SEQ ID NO:13)
shown in Table 3A. An open reading frame for the mature protein was identified beginning with
an ATG codon at nucleotides 800-802 and ending with a TAA codon at nucleotides 1766-1768.
Putative untranslated regions downstream from the termination codon and upstream from the
initiation codon are underlined in Table 3A, and the start and stop codons are in bold letters.
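The open reading frame coordinates given above can be checked against the stated polypeptide length with simple arithmetic. The sketch below is illustrative; the function name and the returned dictionary keys are assumptions made for this example.

```python
def orf_summary(start_codon_first, stop_codon_last):
    """Summarise an ORF from 1-based, inclusive nucleotide coordinates.

    start_codon_first: position of the first base of the start codon.
    stop_codon_last:   position of the last base of the stop codon.
    """
    span = stop_codon_last - start_codon_first + 1
    if span % 3 != 0:
        raise ValueError("ORF span is not a whole number of codons")
    codons = span // 3
    # The stop codon is counted as a codon but does not encode a residue.
    return {"nucleotides": span, "codons": codons, "residues": codons - 1}


# NOV3: ATG at nucleotides 800-802, TAA at nucleotides 1766-1768.
print(orf_summary(800, 1768))
```

For the NOV3 coordinates this gives 969 nucleotides, 323 codons including the stop codon, and 322 encoded residues.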
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Table 3A. NOV3 Nucleotide Sequence (SEQ ID NO:13) 



AAA CTCACTZ^ -AAAT* APA^a ffflAf AHA ATGTGTCCCGTGGGTCCAAGGCAA AGCATGGTTCGTTTGCTCCAGAT 
TXT^TnrzinTnrTTrAGnTGCTCAC TGAAGTTCCCTCTGGGAGGAACGGTGCTCTAGATGGGTTTTGTCAATCTAG 



ATaz^ErTTTAATGGTTTACAGTAGATTTCTC TATATTTTGCAGTAGATTATAAAATACATAATGTATATATACAGT 



GTCTCT. 



'AAA^AAAACCCAGCAAATTTATTTTCAAATACATCTGTGTGTGAGCCAATCCAAGTGGGCTGACATGGG 



TGATGTCCACATT' 



GCATGTTCAAATGCT 



'TC CCATCTGCTGTGCTGGGCATGTTCAAATGCTCTGGGTTGATTATGCAGGGCTGGATGCTGG 
GTGGGTTGATTATGCAGGGCTGGATTTTGTGCTCTTTGCCTTTGGACAGGAGCTTGGGATT 



GTGGGTCTGGAGA< 



GAATCAAAATCTGGACCACAGCACAGTTCATCTCTTGCTTCATGGAATTAGAGGGAAGACTAG 



AGCAAGTGAA* 



GCAGAAACAAAGCATCAATTGCTAGGTTCAAAGACAACCATGTCCTGTTTCTCCGTATGACATCTG 



ArTTGCGATATACATGACGCAGTTTGCTTATCTGTCAGAGTTACTACATGGTTGTTGGAACTAAACAAGTAATAAA 



TAATTGAAGTTCTGTCCTCTCCCATCACTGTCAGTATTGATGTCTTCCTCAGGTGCAGTAGAGATGGGAGCAACCA 
ATGACAGCACCTTCAGCCATTTCATCCTTATAGGCTTCTCTGACCGGCCCGAGCTGGAGAGGGTCCTCTTCGCCAT 
CATCCTGCCCGCCTACCTCCTAACCCTGCTGGGCAACAGCATCATCATCCTGGTATCCAGGCTGGACCCGCACCTT 
CACACCCCCATGTACTTCTTTCTCACACACCTGTCCTTCCTTGACCTCAGCTTCACCAGTAGCTCCATCCCACAGC
TACTCTATAACCTCAGCGGGCCGGACAAGACCATCAGCTATGTGGGCTGTGCTCTGCAGCTGGTCCTGTTCCTGGG 
CCTGGGGGGTGTGGAGTGCCTGCTGCTGGCTGTCATGGCCTATGACCGCTTTGTGGCGGTCTGCAAGCCCCTGCAC 
TACATGGTCATCATGAACCCCCAGCTCTGCCGGGGCTTGGTGTCAGTGACCTGGGGCTGTGGGGTGGCCAACTCCT 
TGGCCATGTCTCCTGTGACCCTGCGCTTACCCCGCTGTGGGCACCACGAGGTGGACCACTTCCTGCGTGAGATGCC 
CGCCCTGATCCGGATGGCCTGCGTCAGCACTGTGGCCATNGAAGGCACCGTCTTTGTCCTGGCGGTGGGTGCTGCC 
CTGTCCCCCTTGGTGTTTATCATGATATCTTACAGCTACATTGTGAGGGCTGTGTTACAAATTCGGTCAGCATCAG 
GAAGGCAGAAGGCCTTCGGCACCTGCGGCTCCCATCTCACTGTGGTCTCCCTTTTCTATGGAAACATCATCTACAT 
GTACATGCAGCCAGGAGCCAGTTCTTCCCAGGACCAGGGCAAGTTCCTCACGCTCTTCTACAACATTGTCACCCCC 
CTCCTCAATCCTCTCATCTACACCCTCAGAAACAGAGAGGTGAAGGGGGCACTGGGAAGGTTGCTTCTGGGGAAGA 
GAGAGGTAGGAAAGGAGTA AAGGCATCTCCACCTGACTTCACCTCCATCCAGGGCCACTGGCAGCATCTGGAACGG 
rTGAATTCCAGCTGATA TTAGCCCACGACTCCCAACTTGGCTTTTTCTGGACTTTTGTGAGGCTGTTTGAGTTCTG 

ACATTATGT 



GTTTTTGTTGTTGCTCTTAAAATTGAGACGGGGTCTCACTCTGTCACCTAGGGTGGAGTGCAGTGGT 



GCCACCATAGCTCCTTCGACTATTGGGCTTAAGCGATCCTCCCCGACCTCAGCCTTCCAAGTAACTGGGACTACAG 



GTGTG 1 



CATCACTGGCAGTGGGAATTGTGGCTTTTCTGTGTTCTATGGAGACGGGGTCTTGCTGTGTTGACCAGGGT 



GGT< 



nGGCAAACTCCTGGCCTCATGTGATCCTCCTGCCATGGCCTCCTAAAGTTCTGGGATTACAAGTGTGAGTCAC 



TGTGACT 



'GGCCAACATTATGTGATTTATGTGTGAACTATATAACACAAAT CAT CCCCAAAACC CATC ATGATCTGT 



AAAGCAGCTGCAAAGAATGAAGTGAGAGAAACAGTTGTAAAGATGAGTTTCCACCTACTTATACCAGAGTGCTAAG 



AGGAAATA ACTCTTCTCAATCAGAGCTTTGCTTTGTTTGTTGTTGTTTGCTTTAAAGTCTAACACACCTGACATGT 
TTCAGTC AGAATGACCCCAAATGCATCACTGTTCTCCACGTGGTCCAAGTGCCTCTCTGTTTAGGGCCATCAAATC 
ATGGAATGCAGCACAGTTTGATATTTTCTATATTCCCAATTCCTACCCAAACCTTTTCATGAAATCGTAGAGTTTG 



TTTTACCCTTTATCTGGTGTAAGATTCTGCATAAACCAAGAAGTGAACCTGTAATATCTATC 



The nucleic acid sequence of NOV3 maps to chromosome 1 and has 629 of 918 bases (68%)
identical to a gb:GENBANK-ID:AF098664|acc:AF098664.1 mRNA from Homo sapiens (Homo
sapiens olfactory receptor-like protein (OR2C1) gene, complete cds).

The NOV3 polypeptide (SEQ ID NO: 14) is 322 amino acid residues in length and is
presented using the one-letter amino acid code in Table 3B. The SignalP, Psort and/or
Hydropathy results predict that NOV3 has a signal peptide and is likely to be localized to the
plasma membrane with a certainty of 0.6000. In alternative embodiments, a NOV3 polypeptide
is located to the Golgi body with a certainty of 0.4000, the endoplasmic reticulum (membrane)
with a certainty of 0.3000, or the microbody (peroxisome) with a certainty of 0.3000. The
SignalP predicts a likely cleavage site for a NOV3 peptide between amino acid positions 58 and
59, i.e., at the dash in the sequence VSR-LD.



Table 3B. Encoded NOV3 Protein Sequence (SEQ ID NO: 14) 

MSSSGAVEMGATNDSTFSHFILIGFSDRPELERVLFAIILPAYLLTLLGNSIIILVSRLDPHLHTPMYFFLTH 
LSFLDLSFTSSSIPQLLYNLSGPDKTISYVGCALQLVLFLGLGGVECLLLAVMAYDRFVAVCKPLHYMVIMNP 
QLCRGLVSVTWGCGVANSLAMSPVTLRLPRCGHHEVDHFLREMPALIRMACVSTVAXEGTVFVLAVGAALSPL 
VFIMISYSYIVRAVLQIRSASGRQKAFGTCGSHLTVVSLFYGNIIYMYMQPGASSSQDQGKFLTLFYNIVTPL
LNPLIYTLRNREVKGALGRLLLGKRELGKE 
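The predicted cleavage site between positions 58 and 59 can be located in the sequence of Table 3B. The following is a small sketch; the function name is illustrative, and only the first printed line of Table 3B is used.

```python
def split_at_cleavage(sequence, last_signal_residue):
    """Split a protein at a cleavage site given as the 1-based position of
    the last residue of the signal peptide."""
    return sequence[:last_signal_residue], sequence[last_signal_residue:]


# First line of the NOV3 sequence in Table 3B (residues 1-73).
first_line = ("MSSSGAVEMGATNDSTFSHFILIGFSDRPELERVLFAIILPAYLLTLLGN"
              "SIIILVSRLDPHLHTPMYFFLTH")

signal, remainder = split_at_cleavage(first_line, 58)
# Show the three residues before and the two residues after the site.
print(signal[-3:] + "-" + remainder[:2])
```

The printed context is VSR-LD, which agrees with the cleavage site stated in the text.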

The NOV3 amino acid sequence has 281 of 314 amino acid residues (89%) identical to,
and 295 of 314 amino acid residues (93%) similar to, the 314 amino acid residue
gi|17445344|ref|XP_060558.1| (XM_060558) protein from Homo sapiens (Human) (similar to
OLFACTORY RECEPTOR) (E = e-149).
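The identity and similarity percentages quoted above follow from the match counts. A minimal sketch, assuming the reported figures are truncated rather than rounded (an inference from 295 of 314, which is about 93.9%, being reported as 93%); the function name is illustrative.

```python
def percent(matches, aligned_length, truncate=True):
    """Percentage of matching positions over the aligned length.

    truncate=True drops the fractional part; truncate=False rounds to the
    nearest whole number.
    """
    value = 100.0 * matches / aligned_length
    return int(value) if truncate else round(value)


print(percent(281, 314))                   # identities
print(percent(295, 314))                   # positives, truncated
print(percent(295, 314, truncate=False))   # positives, rounded
```

With truncation the two NOV3 figures come out as 89 and 93, matching the text; rounding the second would give 94 instead.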

NOV3 is expressed in at least the following tissues: liver, spleen. This information was 
derived by determining the tissue sources of the sequences that were included in the invention 
including but not limited to SeqCalling sources, Public EST sources, Literature sources, and/or 
RACE sources. 

NOV3 has homology to the amino acid sequences shown in the BLASTP data listed in 
Table 3C. 



Table 3C. BLAST results for NOV3

Gene Index/Identifier                              Protein/Organism                                            Length (aa)  Identity (%)   Positives (%)  Expect
gi|17445344|ref|XP_060558.1| (XM_060558)           similar to olfactory receptor (H. sapiens) [Homo sapiens]   314          281/314 (89%)  295/314 (93%)  e-149
gi|5901478|gb|AAD55304.1|AF044033_1 (AF044033)     olfactory receptor [Marmota marmota]                        237          196/237 (82%)  216/237 (90%)  e-102
gi|13624329|ref|NP_112165.1| (NM_030903)           olfactory receptor, family 2, subfamily W, member 1         320          178/305 (58%)  236/305 (77%)  3e-97
gi|12054431|emb|CAC20523.1| (AJ302603)             olfactory receptor [Homo sapiens]                           320          178/305 (58%)  236/305 (77%)  4e-97
gi|12054429|emb|CAC20522.1| (AJ302602)             olfactory receptor [Homo sapiens]                           320          178/305 (58%)  236/305 (77%)  5e-97

The homology of these sequences is shown graphically in the ClustalW analysis shown in
Table 3D.



Table 3D. ClustalW for NOV3 

1) NOV3 (SEQ ID NO:14)

2) gi|17445344|ref|XP_060558.1| (XM_060558) similar to olfactory receptor (H.
sapiens) [Homo sapiens] (SEQ ID NO:44)

3) gi|5901478|gb|AAD55304.1|AF044033_1 (AF044033) olfactory receptor [Marmota
marmota] (SEQ ID NO:45)

4) gi|13624329|ref|NP_112165.1| (NM_030903) olfactory receptor, family 2, subfamily
W, member 1 [Homo sapiens] (SEQ ID NO:46)

5) gi|12054431|emb|CAC20523.1| (AJ302603) olfactory receptor [Homo sapiens] (SEQ
ID NO:47)

6) gi|12054429|emb|CAC20522.1| (AJ302602) olfactory receptor [Homo sapiens] (SEQ
ID NO:48)



10 20 30 40 50
....|....|....|....|....|....|....|....|....|....|

NOV3 COR100339661  MSSSGAVE GAT D TFSH I DR EL RV FAIILPA L L
                   GT G TQTH DR HL R FV IL A L
                   QS Y SLHG NH KM M SG VA F I
                   QS Y SLHG NH KM M SG VA F I
                   QS Y SLHG NH KM M SG VA F I
gi|17445344|
gi|5901478|
gi|13624329|
gi|12054431|
gi|12054429|
                   V R T V R PH

NOV3 COR100339661
gi|17445344|
gi|5901478|
gi|13624329|
gi|12054431|
gi|12054429|

110 120

NOV3 COR100339661  GLVSVT GCGV LAMSPV HHEV R M IRM S
gi|17445344|       GLVSVT GCGV LAMSPV HHEV R M
gi|5901478|        GLVSVA GCGM L MSPV H KV M IRM
gi|13624329|       KMIIMI SISL V LCTL N T N IL L VKI D
                   D

NOV3 COR100339661  VAX GT V AVGAA S VF M S VR QIR ASGRQ FG
gi|17445344|       VAI GT V KKGV S VF L S VR QIR ASGRQ FG
gi|5901478|        VAI GT V AVG S VF V H VR F IQ SSGRHRIF
gi|13624329|       TTV MS A GII T IL AK TK KASQR M
gi|12054431|       TTV MS A TK KASQR M

NOV3 COR100339661

NOV3 COR100339661  REV G GR LLGKRELG B--
gi|17445344|       REV G GR LLGKRELG B--
gi|5901478|
gi|13624329|       KDM D KK MRFHHKST IKRNCKS
gi|12054431|       KDM D KK MRFHHKST IKRNCKS
gi|12054429|       KNM D KK MRFHHKST IKRNCKS


Table 3E lists the domain description from DOMAIN analysis results against NOV3.
This indicates that the NOV3 sequence has properties similar to those of other proteins known to
contain these domains.







Table 3E. Domain Analysis of NOV3 




gnl|Pfam|pfam00001, 7tm_1, 7 transmembrane receptor (rhodopsin family).

(SEQ ID NO:49)

CD-Length = 254 residues, 100.0% aligned
Score = 82.0 bits (201), Expect = 5e-17

NOV 3:   49 GNSIIILVSRLDPHLHTPMYFFLTHLSFLDLSFTSSSIPQLLYNLSGPDKTISYVGCALQ 108
            II 1 II M + k III + 1 Mill II
Sbjct:    1 GNLLVILVILRTKKLRTPTNIFLLNLAVADLLFLLTLPPWALYYLVGGDWVFGDALCKLV 60

NOV 3:  109 LVLFLGLGGVECLLLAVMAYDRFVAVCKPLHYMVIMNPQLCRGLVSVTWGCGVANSLAMS 168
            11+ 1 III ++ Ik+k II 1 1 k + k + 1 +1!
Sbjct:   61 GALFVWGYASILLLTAISIDRYLAiraPLRYRRIRTPRRAKVLILLVWVLALLLSLPPL 120

NOV 3:  169 -PVTLRLPRCGHHEVDHFLREMPALIRMACVSTVAXEGTVFVLAVGAALSPLVFIMISYS 227
            II k 1 "1 1 + + + Ik 1" k
Sbjct:  121 LFSWLRTVEEGNTTVCLIDFPEESVKR S YVLLS TLV GFVLPLLV I LVCYT 170

NOV 3:  228 YIVRAV LQI RSASGRQKAFGTCGSHLTVVSLFYG NIIYMYMQPGASS 274
            kl + k Ikl k 1 + 1 +
Sbjct:  171 RILRTLRKRARSQRSLKRRSSSERKAAKMLLVVVVVFVLCWLPYHIVLLLDSLCLLSIWR 230

NOV 3:  275 SQDQGKFLTLFYNIVTPLLNPLIY 298
            + lk 1 llkll
Sbjct:  231 VLPTALLITLWLAYVNSCLNPIIY 254
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G-protein-coupled receptors (GPCRs) constitute a vast protein family that encompasses a 
wide range of functions (including various autocrine, paracrine and endocrine processes). They 
show considerable diversity at the sequence level, on the basis of which they can be separated into 
distinct groups. The term "clan" is used to describe the GPCRs, as they embrace a group of
families for which there are indications of evolutionary relationship, but between which there is
no statistically significant similarity in sequence. The currently known clan members include the
rhodopsin-like GPCRs, the secretin-like GPCRs, the cAMP receptors, the fungal mating
pheromone receptors, and the metabotropic glutamate receptor family. The metabotropic
glutamate receptors are functionally and pharmacologically distinct from the ionotropic glutamate
receptors. They are coupled to G-proteins and stimulate the inositol phosphate/Ca2+ intracellular
signalling pathway. The amino acid sequences of the receptors contain high proportions of
hydrophobic residues grouped into seven domains, in a manner reminiscent of the rhodopsins and
other receptors believed to interact with G-proteins. However, while a similar 3D framework has
been proposed to account for this, there is no significant sequence identity between these and
receptors of the rhodopsin-type family: the metabotropic glutamate receptors thus bear their own
distinctive '7TM' signature. This 7TM signature is also shared by the calcium-sensing receptors,
and GABA (gamma-amino-butyric acid) type B (GABA(B)) receptors.

The protein similarity information, expression pattern, and map location for the NOV3
protein and nucleic acid disclosed herein suggest that it may have important structural and/or
physiological functions characteristic of the GPCR family. Therefore, the nucleic acids and
proteins of the invention are useful in potential diagnostic and therapeutic applications and as a
research tool. These include serving as a specific or selective nucleic acid or protein diagnostic
and/or prognostic marker, wherein the presence or amount of the nucleic acid or the protein is to
be assessed, as well as potential therapeutic applications such as the following: (i) a protein
therapeutic, (ii) a small molecule drug target, (iii) an antibody target (therapeutic, diagnostic, drug
targeting/cytotoxic antibody), (iv) a nucleic acid useful in gene therapy (gene delivery/gene
ablation), (v) a composition promoting tissue regeneration in vitro and in vivo, and (vi) a biological
defense weapon.

The NOV3 nucleic acid and protein are useful in potential diagnostic and therapeutic
applications implicated in various diseases and disorders described below and/or other
pathologies. For example, the compositions of the present invention will have efficacy for
treatment of patients suffering from: Cardio-vascular diseases, Von Hippel-Lindau (VHL)
syndrome, Cirrhosis, Transplantation, Hemophilia, Hypercoagulation, Idiopathic
thrombocytopenic purpura, Immunodeficiencies, Graft versus host disorders and other diseases,
disorders and conditions of the like.

NOV4 

NOV4 includes two novel ankyrin repeat containing proteins. The disclosed proteins have 
been named NOV4a and NOV4b. 



NOV4a 

A disclosed NOV4a nucleic acid (designated as CuraGen Acc. No. COR87934767)
encodes a novel ankyrin repeat containing protein and includes the 2381 nucleotide sequence
(SEQ ID NO:15) shown in Table 4A. An open reading frame for the mature protein was
identified beginning with an ATG codon at nucleotides 849-851 and ending with a TAA codon at
nucleotides 1965-1967. Putative untranslated regions downstream from the termination codon
and upstream from the initiation codon are underlined in Table 4A, and the start and stop codons
are in bold letters.



Table 4A. NOV4a Nucleotide Sequence (SEQ ID NO:15) 

GGGAAAATTGACGGGAGGGAAGAGGGTGGAGAGCAGGACAGAGAGGGCGGTGCAGAAGGGGAATATCCCTCCTGAG

TTCCCTGGAAGAGCGTCAGCCTGGACCCTGGTCTTGGGCTTCTCTGCTGGAATCCTGGGCAGCCCCGGGTGCTGCG

GCGAGGGTCAAGGCCACACAAAGGGCAAGGGAGGCAGACGAGCCAGTCACATGGGGCAGTCGAGCTGCCTGCGTGA

ATGCTAGGCGCGGGACAATGGCAACTCCGGGACAAAGTGCAGGGAGACTCCTGAAGAGATAAGAGGGAAGGGCGAA

GGAAGGGGGCGGGGAGCCAGAGCCTCGGAGCTCCAGGACCGCGCTTTGGGAGACCGTGGCTGGAAGCCGAGCTCGG

CCCGCTGCGGAAGGGGCGCCCTCGCGCCTCTACACTCTAGCCCCGGCTGGGATGCTGAGAACCGCGGCTTCCAGGG

CCGCAGGCGAGCTCCCAGCCAGTCCCCGCGCCCGCCCTTCGGTGCTGGAGGCGGGGCTGCCGAGCTCACCTGGCCG

TTTGGGGTGGGACCGCCCGCGACCCGGGGGAGCTGCAGAGGCGGCGGTACCCAGGGAAGTGGAGCTGGGCTTGCCC

TGGGGACTTGGCTGGAGCTCACACCCCTCCACGCCCCCCAAGGCCTGCGCGGGGGCCCTCCCCTAGCTCCCTCCCT

CCTCCTCCTCCTCCTCCTCCTCCTCTCCTTTGCTCGCTCCCTCCGAACCCAATTGCTCAAGCAGCTTCCTTCCCCA

ACGCCAGCGCCAGTTCCTCTCCCGTTGGGGCCCGGGAAGGGCAGCTAACGCTGGACACTGGGACGGCCGCGGCGGC

AGCTTCAAGACCATGGCCCAGCTCGGAGGGGCCGCGAACCGGGCACCCACGGCCTCTCTCGCGCCGACCTCGCAGA 

GCCTGCGGTGCGCCCCGCAGCCCCGCCCCTCGAGAGCGGACACTGGTAGCCTGGGCAGGTACTGGGGCAAAGCCGC 

AGCCGCCGCCTCCCGGGAGCACCCCTTCCCAGGCACGCTGATGCACTCTGCAGCGGGCTCAGGGCGCCGGCGGGGA 

GCGCTGCGGGAACTGCTGGGGCTGCAGCGGGCGGCTCCTGCGGGGTGGCTGTCGGAGGAGCGCGCCGAGGAGCTGG 

GCGGGCCGAGTGGGCCGGGCAGCAGCAGGCTGTGCCTGGAACCGCGGGAGCACGCGTGGATTCTGGCAGCCGCCGA 

GGGCCGCTATGAGGTGCTGCGGGAGCTGCTGGAGGCTGAGCCGGAGCTGCTGCTGAGGGGCGACCCGATCACCGGC 

TACTCGGTTCTGCACTGGCTGGCCAAGCACGGGCGCCACGAGGAGCTCATTCTGGTACACGACTTCGCCCTACGCC 

GGGGGCTGAGGCTCGACGTGAGCGCCCCAGGCAGCGGCGGCCTCACGCCCCTCCACCTGGCGGCCCTTCAGGGCCA 

CGACATGGTCATCAAGGTGCTGGTGGGCGCCCTGGGTGCTGACGCTACGCGCCGCGACCACAGCGGCCACCGGGCC 

TGCCACTACCTGCGGCCCGACGCGCCTTGGAGGTTGCGGGAGCTGTCGGGAGCCGAGGAATGGGAGATGGAGAGCG 

GCAGCGGGTGCACCAACCTGAACAACAACAGCAGCGGCACCACTGCGTGGAGGGCCGCGAGCGCAGTGGGGCGCGA 

ACGGCTGTGGAGACAAGCAGGAGAGTGGCAGCGTCGCGGACCAAGGCGAAGGACACCGCGGGCAGCCGGGTGGCGC 
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AAATGCATAGCCTTTTCCGCCATCTGTTCCCCTCATTCCAGGACCGTTGACAGGGACAGAGACTGGAGAGCTAGGA 
GGGGCTGTGACACTGTGGCGATGGCTAGGTCCTGGGTTGTCCCGGGTTCCACCGAAGGAGAGGCGCCTTGGACGCT 
GCTTGGGCCTGCAAGGAACAGAACACGTCGGGGTCCGACTCAGGTACTTGTCTCAGGTCTCCTGTAACCACCGGCC 
TGGAGGACCCGGGGACTCGGGCACCACNTCACCAAGAGAGAGTGAAGGACCAAGCTGGCCTGGCTCCGAGTTCCAA 
AGCTACAGGACTAAGGAGTTGGGAGCAGGGAGCGTGGTCCTGCTTGGGAGAGGGCAAGTTAAGCTTGCAGGGGCCA 
TTTCTGGGCAGGCCGACGCGCTGGGTTTATTAGGAAACATTCGCTAGAAGAATGAGTTAAGATTGTAAAGGACCAA 
TGCAGAGAAAACGCCTAACTCTGCCGGCCTCGCTCGGCCATTAATGGGTCTTGGGGTGCGGGTAGAGTCAGCCTCT 
GACAACCTCCTCCTGAGACGACCCAGCCTTACTGGTACTTTTTCTCATGTATCACAGGTTACTTCTTATGTATATT 
AAAGTGGAATATGTGTTCTTTTCAC 



The nucleic acid sequence of NOV4a maps to chromosome X and has 764 of 1297 bases 
(58%) identical to a gb:GENBANK-ID:AK025523|acc:AK025523.1 mRNA from Homo sapiens 
(Homo sapiens cDNA: FLJ21870 fis, clone HEP02445). 
The NOV4a polypeptide (SEQ ID NO:16) is 372 amino acid residues in length and is
presented using the one-letter amino acid code in Table 4B. The SignalP, Psort and/or
Hydropathy results predict that NOV4a has a signal peptide and is likely to be localized to the
microbody (peroxisome) with a certainty of 0.4763. In alternative embodiments, a NOV4a
polypeptide is located to the nucleus with a certainty of 0.3000, the lysosome (lumen) with a
certainty of 0.2592, or the mitochondrial matrix space with a certainty of 0.1000.



Table 4B. Encoded NOV4a Protein Sequence (SEQ ID NO:16) 

MAQLGGAANRAPTASLAPTSQSLRCAPQPRPSRADTGSLGRYWGKAAAAASREHPFPGTLMHSAAGSGRRRGA
LRELLGLQRAAPAGWLSEERAEELGGPSGPGSSRLCLEPREHAWILAAAEGRYEVLRELLEAEPELLLRGDPI
TGYSVLHWLAKHGRHEELILVHDFALRRGLRLDVSAPGSGGLTPLHLAALQGHDMVIKVLVGALGADATRRDH
SGHRACHYLRPDAPWRLRELSGAEEWEMESGSGCTNLNNNSSGTTAWRAASAVGRERLWRQAGEWQRRGPRRR
TPRAAGWRKCIAFSAICSPHSRTVDRDRDWRARRGCDTVAMARSWVVPGSTEGEAPWTLLGPARNRTRRGPTQ
VLVSGLL
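The start of the protein sequence in Table 4B can be cross-checked by translating the first codons of the open reading frame in Table 4A. The sketch below uses the standard genetic code; the function name and the choice of a 30-nucleotide excerpt are assumptions made for this example.

```python
from itertools import product

# Standard genetic code, codons ordered by first, second, third base over T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna):
    """Translate a DNA string in frame 1, stopping at the first stop codon.
    Codons containing ambiguous bases (for example N) are reported as X."""
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE.get(dna[i:i + 3].upper(), "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 30 nucleotides of the NOV4a open reading frame (Table 4A, from the ATG).
orf_start = "ATGGCCCAGCTCGGAGGGGCCGCGAACCGG"
print(translate(orf_start))
```

The excerpt translates to MAQLGGAANR, the first ten residues listed in Table 4B.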



The NOV4a amino acid sequence has 273 of 273 amino acid residues (100%) identical to,
and 273 of 273 amino acid residues (100%) similar to, the 314 amino acid residue
gi|17486018|ref|XP_066736.1| (XM_066736) protein (similar to LD31582p, H. sapiens) (E = e-125).

NOV4a is predicted to be expressed in the following tissues because of the expression
pattern of (GENBANK-ID: gb:GENBANK-ID:AK025523|acc:AK025523.1) a closely related
Homo sapiens cDNA: FLJ21870 fis, clone HEP02445 homolog in species Homo sapiens: uterus,
lung, kidney, brain and placenta.


NOV4b 

A disclosed NOV4b nucleic acid (designated as CuraGen Acc. No. CG57238-01), a 

variant of NOV4a, includes the 1209 nucleotide sequence (SEQ ID NO: 17) shown in Table 4C. 
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Table 4C. NOV4b Nucleotide Sequence (SEQ ID NO:17) 

AGCTAACGCTGGACACTGGGACGGCCGCGGCGGCAGCTTCAAGACCATGGCCCAGCTCGGAGGGGCCGCG 
AACCGGGCACCCACGGCCTCTCTCGCGCCGACCTCGCAGAGCCTGCGGTGCGCCCCGCAGCCCCGCCCCT 
CGAGAGCGGACACTGGTAGCCTGGGCAGGTACTGGGGCAAAGCCGCAGCCGCCGCCTCCCGGGAGCACCC 
CTTCCCAGGCACGCTGATGCACTCTGCAGCGGGCTCAGGGCGCCGGCGGGGAGCGCTGCGGGAACTGCTG 
GGGCTGCAGCGGGCGGCTCCTGCCGGGTGGCTGTCGGAGGAGCGCGCCGAGGAGCTGGGCGGGCCGAGTG 
GGCCGGGCAGCAGCAGGCTGTGCCTGGAACCGCGGGAGCACGCGTGGATTCTGGCAGCCGCCGAGGGCCG 
CTATGAGGTGCTGCGGGAGCTGCTGGAGGCTGAGCCGGAGCTGCTGCTGAGGGGCGACCCGATCACCGGC 
TACTCGGTTCTGCACTGGCTGGCCAAGCACGGGCGCCACGAGGAGCTCATTCTGGTACACGACTTCGCCC 
TACGCCGGGGGCTGAGGCTCGACGTGAGCGCCCCAGGCAGCGGCGGCCTCACGCCCCTCCACCTGGCGGC 
CCTTCAGGGCCACGACATGGTCATCAAGGTGCTGGTGGGCGCCCTGGGTGCTGACGCTACGCGCCGCGAC 
CACAGCGGCCACCGGGCCTGCCACTACCTGCGGCCCGACGCGCCTTGGAGGTTGCGGGAGCTGTCGGGAG 
CCGAGGAATGGGAGATGGAGAGCGGCAGCGGGTGCACCAACCTGAACAACAACAGCAGCGGCACCACTGC 
GTGGAGGGCCGCGAGCGCAGTGGGCGCGACGGCTGTGGAGACAAGCAGGAGAGTGGCAGCGTCGCGGACC 
AAGGCGAAGGACACCGCGGGCAGCCGGGTGGCGCAAATGCATAGCCTTTTCCGCCATCTGTTCCCCTCAT 
TCCAGGACCGTTGACAGGGACAGAGACTGGAGAGCTAGGAGGGGCTGTGACACTGTGGCGATGGCTAGGT 
CCTGGGTTGCCCCGGGTTCCACCGAAGGAGAGGCGCCTTGGACGCTGCTTGGGCCTGCAAGGAACAGAAC 
ACGTCGGGGTCCGACTCAGGTACTTGTCTCAGGTCTCCTGTAACCACCGGCCTGGAGGACCCGGGGACTC 
GGGCACCACTTCACCAAGA 



The nucleic acid sequence of NOV4b maps to chromosome X and has 764 of 1297
bases (58%) identical to a gb:GENBANK-ID:AK025523|acc:AK025523.1 mRNA from Homo
sapiens (Homo sapiens cDNA: FLJ21870 fis, clone HEP02445).

The NOV4b polypeptide (SEQ ID NO:18) is 315 amino acid residues in length and is
presented using the one-letter amino acid code in Table 4D. The SignalP, Psort and/or
Hydropathy results predict that NOV4b has a signal peptide and is likely to be localized to the
microbody (peroxisome) with a certainty of 0.4763. In alternative embodiments, a NOV4b
polypeptide is located to the nucleus with a certainty of 0.3000, the lysosome (lumen) with a
certainty of 0.2592, or the mitochondrial matrix space with a certainty of 0.1000.



Table 4D. Encoded NOV4b Protein Sequence (SEQ ID NO:18) 

MAQLGGAANRAPTASLAPTSQSLRCAPQPRPSRADTGSLGRYWGKAAAAASREHPFPGTLMHSAAGSGRRRGALRE
LLGLQRAAPAGWLSEERAEELGGPSGPGSSRLCLEPREHAWILAAAEGRYEVLRELLEAEPELLLRGDPITGYSVL
HWLAKHGRHEELILVHDFALRRGLRLDVSAPGSGGLTPLHLAALQGHDMVIKVLVGALGADATRRDHSGHRACHYL
RPDAPWRLRELSGAEEWEMESGSGCTNLNNNSSGTTAWRAASAVGATAVETSRRVAASRTKAKDTAGSRVAQMHSL
FRHLFPSFQDR



The NOV4b amino acid sequence has 273 of 273 amino acid residues (100%) identical to,
and 273 of 273 amino acid residues (100%) similar to, the 314 amino acid residue
gi|17486018|ref|XP_066736.1| (XM_066736) protein (similar to LD31582p, H. sapiens)
(E = e-125).
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NOV4b is predicted to be expressed in the following tissues because of the expression 
pattern of (GENBANK-ID: gb:GENBANK-ID:AK025523|acc:AK025523.1) a closely related 
Homo sapiens cDNA: FLJ21870 fis, clone HEP02445 homolog in species Homo sapiens: uterus, 
lung, kidney, brain and placenta. 

NOV4a and NOV4b are very closely homologous as is shown in the amino acid alignment 
in Table 4E. 

Table 4E. Amino Acid Alignment of NOV4a and NOV4b

COR87934767
CG57238-01

COR87934767
CG57238-01

COR87934767
CG57238-01

CG57238-01

COR87934767
CG57238-01

COR87934767  KC I AFSAI CS PHSRTVDRDR WRAR GCDT MAR WWPGSTEGEAPWT
CG57238-01   KAK TAGS QMH LFR H

             360 370
             ....|....|....|....|..
COR87934767  LGPARN TRRG PTQVLVSGLL
CG57238-01   FPSFQD


Homologies to any of the above NOV4 proteins will be shared by the other NOV4 
proteins insofar as they are homologous to each other as shown above. Any reference to NOV4 is 
assumed to refer to both of the NOV4 proteins in general, unless otherwise noted. 
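How closely NOV4a and NOV4b agree can be quantified by the length of their shared N-terminal region. The following is a sketch under stated assumptions: the two sequences are transcribed from Tables 4B and 4D with apparent scanning errors corrected against the nucleotide sequences of Tables 4A and 4C (an assumption of this example), and the function name is illustrative.

```python
def common_prefix_length(a, b):
    """Number of leading positions at which two sequences are identical."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


# NOV4a (Table 4B) and NOV4b (Table 4D); transcription with assumed corrections.
NOV4A = (
    "MAQLGGAANRAPTASLAPTSQSLRCAPQPRPSRADTGSLGRYWGKAAAAASREHPFPGTLMHSAAGSGRRRGA"
    "LRELLGLQRAAPAGWLSEERAEELGGPSGPGSSRLCLEPREHAWILAAAEGRYEVLRELLEAEPELLLRGDPI"
    "TGYSVLHWLAKHGRHEELILVHDFALRRGLRLDVSAPGSGGLTPLHLAALQGHDMVIKVLVGALGADATRRDH"
    "SGHRACHYLRPDAPWRLRELSGAEEWEMESGSGCTNLNNNSSGTTAWRAASAVGRERLWRQAGEWQRRGPRRR"
    "TPRAAGWRKCIAFSAICSPHSRTVDRDRDWRARRGCDTVAMARSWVVPGSTEGEAPWTLLGPARNRTRRGPTQ"
    "VLVSGLL"
)
NOV4B = (
    "MAQLGGAANRAPTASLAPTSQSLRCAPQPRPSRADTGSLGRYWGKAAAAASREHPFPGTLMHSAAGSGRRRGALRE"
    "LLGLQRAAPAGWLSEERAEELGGPSGPGSSRLCLEPREHAWILAAAEGRYEVLRELLEAEPELLLRGDPITGYSVL"
    "HWLAKHGRHEELILVHDFALRRGLRLDVSAPGSGGLTPLHLAALQGHDMVIKVLVGALGADATRRDHSGHRACHYL"
    "RPDAPWRLRELSGAEEWEMESGSGCTNLNNNSSGTTAWRAASAVGATAVETSRRVAASRTKAKDTAGSRVAQMHSL"
    "FRHLFPSFQDR"
)

shared = common_prefix_length(NOV4A, NOV4B)
print(len(NOV4A), len(NOV4B), shared)
print(NOV4A[shared - 6:shared], "|", NOV4A[shared:shared + 5], "vs", NOV4B[shared:shared + 5])
```

Under these transcriptions the two proteins share their first 273 residues and differ from residue 274 onward, the same 273-residue span reported for the BLASTP hit.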

NOV4a also has homology to the amino acid sequence shown in the BLASTP data listed 
in Table 4F. 
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Table 4F. BLAST results for NOV4 


Gene Index/Identifier                        Protein/Organism                                  Length (aa)  Identity (%)    Positives (%)   Expect
gi|17486018|ref|XP_066736.1| (XM_066736)     similar to LD31582p (H. sapiens) [Homo sapiens]   315          273/273 (100%)  273/273 (100%)  e-125



The homology of these sequences is shown graphically in the ClustalW analysis shown in 
Table 4G. 

Table 4G. ClustalW Analysis for NOV4 

1) NOV4a (SEQ ID NO:16)

2) NOV4b (SEQ ID NO:18)

3) gi|17486018|ref|XP_066736.1| (XM_066736) similar to LD31582p (H. sapiens) [Homo
sapiens] (SEQ ID NO:50)



10 20 30 40 50

NOV4a COR87934767
NOV4b CG57238-01
gi|17486018|

60 70 80 90 100

NOV4a COR87934767
NOV4b CG57238-01
gi|17486018|

110 120 130 140 150

NOV4a COR87934767
NOV4b CG57238-01
gi|17486018|

160 170 180 190 200

NOV4a COR87934767
NOV4b CG57238-01
gi|17486018|

210 220 230 240 250

NOV4a COR87934767
NOV4b CG57238-01
gi|17486018|

260 270 280 290 300

NOV4a COR87934767  RERLWRQ G WQ GPRR PRAAGWR
NOV4b CG57238-01
gi|17486018|

310 320 330 340 350

NOV4a COR87934767  KC I AFSAI CS PHSRTVDRDR WRAR GCDT MAR WWPGSTEGEAPWT
NOV4b CG57238-01
gi|17486018|

360 370

NOV4a COR87934767  LGPARN TRRGPTQVLVSGLL
NOV4b CG57238-01
gi|17486018|

Table 4H lists the domain description from DOMAIN analysis results against NOV4. 
This indicates that the NOV4 sequence has properties similar to those of other proteins known to 
contain these domains. 

Table 4H. Domain Analysis of NOV4 

gnl|Pfam|pfam00023, ank, Ank repeat. Ankyrin repeats generally consist of a beta,
alpha, alpha, beta order of secondary structures. The repeats associate to form a
higher order structure.

(SEQ ID NO:51)

CD-Length = 33 residues, 97.0% aligned

Score = 35.4 bits (80), Expect = 0.006

NOV 4:  187 GLTPLHLAALQGHDMVIKVLVGALGADATRRDH 219
            I IIIIMI II 1 + 1 + 1+ I III li
Sbjct:    2 GNTPLHLAARNGHLEVVKLLLEA-GADVNARDK 33
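The number of identical positions in the ankyrin repeat alignment above can be counted column by column. The sketch below is illustrative: the function name is hypothetical, and the subject row assumes the scanned "EW" stands for "EVV", which makes both rows 33 columns long.

```python
def count_identities(query, subject, gap="-"):
    """Count alignment columns in which both rows carry the same residue.
    Both rows must be the same length; gap columns never count."""
    if len(query) != len(subject):
        raise ValueError("aligned rows must have equal length")
    return sum(1 for q, s in zip(query, subject) if q == s and q != gap)


# NOV4 residues 187-219 and the pfam00023 row, as aligned in Table 4H.
nov4_row = "GLTPLHLAALQGHDMVIKVLVGALGADATRRDH"
ank_row = "GNTPLHLAARNGHLEVVKLLLEA-GADVNARDK"  # assumes "EW" in the scan is "EVV"

identical = count_identities(nov4_row, ank_row)
print(identical, "identical columns out of", len(nov4_row))
```

Positive (conservative) substitutions would need a scoring matrix and are not counted here.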

The protein similarity information, expression pattern, and map location for the NOV4
proteins and nucleic acids disclosed herein suggest that they may have important structural and/or
physiological functions characteristic of the family. Therefore, the nucleic acids and proteins of
the invention are useful in potential diagnostic and therapeutic applications and as a research tool.
These include serving as a specific or selective nucleic acid or protein diagnostic and/or
prognostic marker, wherein the presence or amount of the nucleic acid or the protein is to be
assessed, as well as potential therapeutic applications such as the following: (i) a protein
therapeutic, (ii) a small molecule drug target, (iii) an antibody target (therapeutic, diagnostic, drug
targeting/cytotoxic antibody), (iv) a nucleic acid useful in gene therapy (gene delivery/gene
ablation), (v) a composition promoting tissue regeneration in vitro and in vivo, and (vi) a biological
defense weapon.

The nucleic acids and proteins of the invention are useful in potential diagnostic and 
therapeutic applications implicated in various diseases and disorders described below and/or other 
pathologies. For example, the compositions of the present invention will have efficacy for 
treatment of patients suffering from: Cardiovascular disorders, Cardiomyopathy, 
Atherosclerosis, Hypertension, Congenital heart defects, Aortic stenosis, Atrial septal defect 
(ASD), Atrioventricular (A-V) canal defect, Ductus arteriosus, Pulmonary stenosis, Subaortic 
stenosis, Ventricular septal defect (VSD), valve diseases, Tuberous sclerosis, Scleroderma, 
Obesity, Transplantation, Systemic lupus erythematosus, Autoimmune disease, Asthma, 
Emphysema, Scleroderma, allergy, Diabetes, Autoimmune disease, Renal artery stenosis, 
Interstitial nephritis, Glomerulonephritis, Polycystic kidney disease, Systemic lupus 
erythematosus, Renal tubular acidosis, IgA nephropathy, Hypercalcemia, Lesch-Nyhan syndrome 
and other diseases, disorders and conditions of the like. 

Ankyrin repeats are tandemly repeated modules of about 33 amino acids. They occur in a 
large number of functionally diverse proteins, mainly from eukaryotes. The few known examples 
from prokaryotes and viruses may be the result of horizontal gene transfers. The conserved fold of 
the ankyrin repeat unit is known from several crystal and solution structures, e.g., from: 
p53-binding protein 53BP2, cyclin-dependent kinase inhibitor p19Ink4d, transcriptional regulator 
GABP-beta, and NF-kappaB inhibitory protein IkB-alpha. The fold has been described as an 
L-shaped structure consisting of a beta-hairpin and two alpha-helices. Many ankyrin repeat 
regions are known to function as protein-protein interaction domains. 

NOV5 

A disclosed NOV5 nucleic acid (designated as CuraGen Acc. No. COR100396092) 
encodes a novel ankyrin repeat containing protein and includes the 6272 nucleotide sequence 
(SEQ ID NO: 19) shown in Table 5A. An open reading frame for the mature protein was 
identified beginning with an ATG codon at nucleotides 7-9 and ending with a TGA codon at 
nucleotides 6181-6183. Putative untranslated regions downstream from the termination codon 
and upstream from the initiation codon are underlined in Table 5A, and the start and stop codons 
are in bold letters. 
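
The open reading frame coordinates given for NOV5 can be checked arithmetically. The following Python sketch is illustrative only and is not part of the disclosed methods. The helper names (`encoded_length`, `translate_prefix`) and the reduced codon table are assumptions introduced here; the table covers only the standard-code codons needed for the first eight codons of Table 5A.

```python
# Illustrative sketch only; helper names are hypothetical and not part of the disclosure.

# Subset of the standard genetic code covering the first eight NOV5 codons.
CODON_TABLE = {
    "ATG": "M", "CCC": "P", "AAG": "K", "GGT": "G",
    "GGG": "G", "TGC": "C", "CCT": "P", "AAA": "K",
}


def encoded_length(start_first_nt, stop_last_nt):
    """Number of residues encoded by an ORF, given the 1-based position of the
    first nucleotide of the start codon and the last nucleotide of the stop codon."""
    span = stop_last_nt - start_first_nt + 1
    if span % 3 != 0:
        raise ValueError("ORF span is not a whole number of codons")
    return span // 3 - 1  # the stop codon encodes no residue


def translate_prefix(nucleotides, start_first_nt, n_codons):
    """Translate n_codons codons starting at the 1-based position start_first_nt."""
    offset = start_first_nt - 1
    codons = [nucleotides[offset + 3 * k: offset + 3 * k + 3] for k in range(n_codons)]
    return "".join(CODON_TABLE[codon] for codon in codons)


# First 30 nucleotides of the NOV5 sequence in Table 5A.
NOV5_PREFIX = "AGGACGATGCCCAAGGGTGGGTGCCCTAAA"

nov5_length = encoded_length(7, 6183)
nov5_start = translate_prefix(NOV5_PREFIX, 7, 8)
```

Under these assumptions, the span from nucleotide 7 to nucleotide 6183 corresponds to 2059 codons including the stop codon, and the first eight codons translate to the residues that open the encoded protein sequence.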

Table 5A. NOV5 Nucleotide Sequence (SEQ ID NO: 19) 

AGGACGATGCCCAAGGGTGGGTGCCCTAAAGCACCACAGCAGGAAGAGCTTCCCCTCAGCAGCGACATGGTGGAGA 
AGCAGACTGGGAAAAAGAAAGATAAAGTTTCTCTAACCAAGACCCCAAAACTGGAGCGTGGCGATGGCGGGAAGGA 
GGTGAGGGAGCGAGCCAGCAAGCGGAAGCTGCCCTTCACCGCGGGCGCCAATGGGGAGCAGAAGGACTCGGACACA 
GGTACCAGCCCGACAGCCTTACCTCTGTGTGACCCCTTCACATACACTGCGGAAGAAGCCAAAGCTGAAAGGCAGA 
AGCAGGGCCCTGAGCGGAAGAGGATTAAGAAGGAGCCTGTCACCCGGAAGGCCGGGCTGTCTGGAATCCGAGCCGG 
CTACCCCCTCTCCGAGCGCCAGCAGGTGGCCCTTCTCATGCAGATGACGGCCGAGGAGTCTGCCAACAGCCCAGTG 
GACACAACACCAAAGCACCCCTCCCAGTCTACAGTGTGTCAGAAGGGAACGCCCAACTCTGCCTCAAAAACCAAAG 
ATAAAGTGAACAAGAGAAACGAGCGTGGAGAGACCCGCCTGCACCGAGCCGCCATCCGCGGGGACGCCCGGCGCAT 
CAAAGAGCTCATCAGCGAGGGGGCAGACGTCAACGTCAAGGACTTCGCAGGCTGGACGGCGCTGCACGAGGCCTGT 
AACCGGGGCTACTACGACGTCGCGAAGCAGCTGCTGGCTGCAGGTGCGGAGGTGAACACCAAGGGCCTAGATGACG 
ACACGCCTTTGCACGACGCTGCCAACAACGGGCACCAGGTGGTGAAGCTGCTGCTGCGGTACGGAGGGAACCCGCA 
GCAGAGCAACAGGAAAGGCGAGACGCCGCTGAAAGTGGCCAACTCCCCCACGATGGTGAACCTCCTGTTAGGCAAA 
GGCACTTACACTTCCAGCGAGGAGAGCAGCTCAGAAGAGGAAGACGCACCATCCTTCGCACCTTCCAGTTCAGTCG 
ACGGCAACAACACGGACTCCGAGTTCGAAAAAGGCCTCAAGCACAAGGCCAAGAACCCAGAGCCACAGAAGGCCAC 
GGCCCCCGTCAAGGACGAGTATGAGTTTGATGAGGACGACGAGCAGGACAGGGTTCCTCCGGTGGACGACAAGCAC 
CTATTGAAAAAGGACTACAGAAAAGAAACGAAATCCAATAGTTTTATCTCTATACCCAAAATGGAGGTTAAAAGTT 
ACACTAAAAATAACACGATTGCACCAAAGAAAGCGTCCCATCGTATCCTGTCAGACACGTCGGACGAGGAGGACGC 
GAGTGTCACCGTGGGGACAGGAGAGAAGCTGAGACTCTCGGCACATACGATATTGCCTGGTAGTAAGACACGAGAG 
CCTTCTAATGCCAAGCAGCAGAAGGAAAAAAATAAAGTGAAAAAGAAGC 

TTCGCTTCGGAAAGCGGAGCGACAAGTTCTGCTCCTCGGAGTCGGAGAGCGAGTCCTCAGAGAGTGGGGAGGATGA 

CAGGGACTCTCTGGGGAGCTCTGGCTGCCTCAAGGGGTCCCCGCTGGTGCTGAAGGACCCCTCCCTGTTCAGCTCC 

CTCTCTGCCTCCTCCACCTCGTCTCACGGGAGCTCTGCCGCCCAGAAGCAGAACGACCAGCACACCAAGCACTGGA 

AAACCATTTCTTCCCCGGCTTGGTCAGAGGTCAGTTCTTTATCAGACTCCACAAGGACGAGACTGACAAGCGAGTC 

TGACTACTCCTCTGAGGGCTCCAGTGTGGAATCGCTGAAGCCAGTGAGGAAGAGGCAGGAGCACAGGAAGCGAGCC 

TCCCTGTCGGAGAAGAAGAGCCCCTTCCTGTCCAGCGCGGAGGGCGCTGTCCCCAAACTGGACAAGGAGGGGAAAG 

TTGTCAAAAAACATAAAACAAAACACAAACACAAAAACAAGGAGAAGATCAGCCAAGAGCTGAAGTTGAAAAGTT 

TACTTACGAATATGAGGACTCCAAGCAGAAGTCAGATAAGGCTATACTGTTAGAGAATGATCTTTCCACTGAAAAC 

AAGCTAAAAGTGTTAAAGCACGATCGCGACCACTTTAAAAAAGAAGAGAAACTTAGCAAAATGAAATTAGAAGAAA 

AAGAATGGCTCTTTAAAGATGAAAAATCACTGAAGAGAATCAAAGACAAACTGAGACTGTACAAAGAGGAGAGAGA 

CAAAATTTCAAAAGAGAAGGAGAAGATTTTTAAAGAAGATAAAGAAAA 

GATAGCCTTTCTGACCGGGATTCATCCTTTGATTTCAAAGGGGCCAAGCTCATCTTGGAGACGGTGAAGGAGGACA 

GCAAGGAGAGGAGGCGGGACAGCCGGGCCCGGGAGAAGCACCCAGCACGAGAGAAGGAGAAGCCCGATAAGAGGAA 

GAGATACAAAGAGAAAGACAAGGACAAAAGTGAGAAATCAATCCTGGAAAAATGTCAGAAGGACAAAGAGAAAAAA 

GAAAAACATAAAGACACACATGGCAAAGACAAAGAAAGGAAAGCGTCTGTCTTTGAAAAGCACAAGGAGAAGAAGG 

ATAAAGAGTCCACAGAAAAGTACAAGGACAGAGCCTCAGTGGACTCCACGCAAGATAAGAAAAATAAACAGGAGAA 

GGCTGAAAAGAAGCACGCTGCCGAAGACAAGGCTAAAAGCAAACACAAAGAGAAGTCGGACAAAGAACATTCCAAG 

GAGAGGAAGTCCTCGAGAAGTGCCGACGCGGAATACAGAGAAAGCGAGGTCTCCTCTGACAGCTTCACGGACCGAG 

AGGACGACAAGAGCGCCTGCCTCCCTGAGAAGCTGAAAGAGAAGAGGCACAGACACTCCTCATCTTCATCCAAGAA 

GAGCCACGACCGAGAGGAGAAGAAAGAGGATTACAAGGAGGGCAGGAAGGGCCAGTACGAAAAGGACCTGGAGGCG 

GATGCTTACGGAGTTTCTTACAACATGAAAGCTATTGAATTGTTTGAAAAGAAAGATAAAAATGATGAACCTCTAA 

AAGAGAAGAAGAAGAGAGAGAAACACAGGGAGAAATGGAGAGACGAGAAGGAGAGGCACCGGGACAGGCATGCGGA 

TAGGCCGAAGCCATCCAAAGACCCAGGCAAGAAAGACGCCAGGCCCAGGGAGAAGCTCCTGGGGGACGGCGACCTG 

ATGATGACCAGCTTCGAGAGGATGCTGTCCCAGAAGGACCTGGAGATCGAGGAGCGCCACAAGCGGCACAAGGAGA 

GGATGAAGCAAATGGAGAAGCTGAGGCACCGGTCCGGAGACCCCAAGCTCAAGGAGAAGGCGAAGCCGGCAGACGA 

CGGGCGGAAGAAGGGTCTGGACATTCCTGCTAAGAAACCGCCGGGGCTGGACCCTCCATTTAAAGACAAAAAGCTC 

AAAGAGTCGACTCCTATTCCACCTGCCGCGGAAAATAAGCTACACCCAGCATCAGGTGCAGACTCCAAAGACTGGC 

TGGCAGGCCCTCACATGAAAGAGGTCCTGCCTGCGTCCCCCAGGCCTGACCAGAGCCGGCCCACTGGCGTGCCCAC 

CCCTACGTCGGTGCTATCCTGCCCCAGCTACGAGGAGGTGATGCACACGCCCAGGACCCCGTCCTGCAGCGCCGAT 

GACTACGCGGACCTCGTGTTCGACTGCGCCGACTCGCAGCACTCCACGCCCGTGCCCACCGCTCCCACCAGCGCCT 

GCTCCCCCTCCTTTTTCGACAGGTTCTCCGTGGCTTCAAGTGGGCTTTCGGAAAACGCCAGCCAGGCTCCTGCCAG 

GCCTCTCTCCACAAACCTTTACCGCTCGGTCTCTGTCGACATTGACAAGCTCTTCAGGCAGCAGAGCGTTCCTGCT 

GCCTCCAGCTACGACTCTCCCATGCCACCCTCGATGGAAGACAGGGCGCCCCTGCCCCCGGTTCCCGCGGAGAAGT 

TTGCCTGCTTGTGGCCAGGGTACTACTCCCCAGACTATGGCCTCCCGTCGCCCAAAGTCGACGCTTTGCACTGCCC 

ACCGGCTGCCGTTGTCACTGTCACCCCGTCTCCAGAGGGCGTCTTCTCAAGTTTACAAGCAAAACCTTCCCCTTCC 

CCCCCTTCCCTGGACACCTCCGAGGACCAGCAGGCGACGGCCGCCATCATCCCCCCGGAGCCCAGCTACCTGGAGC 

CGCTGGACGAGGGTCCCTTCAGCGCCGTCATCACCGAGGAGCCCGTTGAGTGGGCCCACCCCTCCGAGCAGGCGCT 

TGCCTCTAGCCTGATCGGGGGCACCTCTGAAAACCCTGTGAGCTGGCCTGTGGGCTCGGACCTCCTGCTGAAGTCT 

CCACAGAGATTCCCCGAGTCCCCAAAGCGTTTCTGCCCCGCGGACCCCCTCCACTCTGCCGCCCCAGGGCCCTTCA 

GCGCCTCGGAGGCGCCGTACCCCGCCCCTCCCGCCTCTCCTGCCCCGTACGCTCTGCCCGTCGCTGAGCTGGAGGA 

CGTCAAGGACGTCCCCGCCGCCATCTCCACCTCAGAGGCGGCTCCCTACGCCCCTCCCTGCGGGCTGGAGTCCTTC 

TTCAGCAACTGCAAGTCACTTCCGGAAGCCCCGCTGGACGTGGCCCCCGAGGCTCTGGGGCCCCTGGAAAATAGCT 

TCCTGGACGGCAGCCGCGGCCTGTCTCACCTCGGCCAGGTGGAGCCGGTGCCCTGGGCGGACGCCTTCGCCGGCCC 

CGAGGACGACCTGGACCTGGGGCCCTTCTCCCTGCCGGAGCTTCCCCTGCAGACTAAAGATGCCGCAGATGGTGAA 

GCGGAACCCGTGGAAGAAAGTCTTGCTCCTCCAGAAGAGATGCCTCCAGGGGCCCCCCGGGAGCTCGAGCCTGAGC 

CCTCAGGGGAGCCAAAGCTGGACGTGGCTCTAGAAGCTGCGGTGGAGGCGGAGACGGTGCCGGAAGAGAGGGCCCG 

TGGGGATCCGGACTCCAGCGTGGAGCCCGCGCCCGTTCCCCCAGAACAGCTGGGGAGCGGAGACCCCTCCCTCTGT 

GCCCCTGACGGCCCCGCCCCGAACACTGTGGCACAAGCTCAGGCCGCAGACGGTGCCGGCCCCGAGGACGACACTG 

AGGCCTCCCGTGCCGCCGCCCCAGCCGAAGGCCCTCCTGGCCAGCCGGAAGCCGCAGAACCAAAACCCACGGCCGA 

AGCCCCGAAGGCCCCCCGAGAGATCCCTCAGCGCATGACCAGGAACCGGGCGCAGATGCTCGCGAACCAGAGCAAG 
CAGGGCCCGCCCCCCTCCGAGAAGGAGTGCGCCCCCACCCCTGCCCCGGTCACCAGGGCCAAGGCCCGCGGCTCCG 
AGGACGACGACGCCCAGGCCCAGCATCCGCGCAAACGCCGCTTTCAGCGCTCCACCCAGCAGCTGCAGCTGAACAC 
GTCCACGCAGCAGACGCGGGAGGTGATCCAGCAGACGCTGGCCGCCATCGTGGACGCCATCAAGCTGGATGCCATC 
GAGCCCTACCACAGCGACAGGGCCAACCCCTACTTCGAATACCTGCAGATCAGGAAGAAGATCGAGGAGAAGCGCA 
AGATCCTGTGCTGTATCACGCCGCAGGCGCCCCAGTGCTACGCCGAGTACGTCACCTACACGGGCTCCTACCTCCT 
GGACGGCAAGCCGCTCAGCAAGCTCCACATCCCCGTGATCGCACCCCCTCCCTCCCTGGCGGAGCCCCTGAAGGAG 
CTGTTCAGGCAGCAGGAGGCCGTCCGGGGAAAGCTGCGTCTACAGCACAGCATCGAGCGGGAGAAGCTGATCGTAT 
CCTGTGAGCAGGAGATTCTGCGGGTTCACTGCCGGGCGGCCAGGACCATCGCCAACCAGGCAGTGCCATTCAGCGC 
CTGCACGATGCTGCTGGACTCCGAGGTCTACAACATGCCCCTGGAGAGCCAGGGTGACGAGAACAAGTCAGTGCGC 
GACCGTTTCAACGCCCGCCAGTTCATCTCCTGGCTCCAGGACGTGGATGACAAGTATGACCGCATGAAGGTCTGCC 
TCCTCATGCGGCAGCAGCACGAGGCCGCGGCCCTGAACGCCGTGCAGAGGATGGAGTGGCAGCTGAAGGTGCAGGA 
ACTGGACCCCGCCGGGCACAAGTCCCTGTGCGTGAACGAGGTGCCCTCCTTCTACGTGCCCATGGTCGACGTCAAC 
GACGACTTTGTATTGTTGCCGGCATGACACCGCGGGACGGCCGCAGGACGCAGGCGAGGGCCGCACGGCTGCCCAG 
GACTGCTGCTGAGCCCCAGGGGCGGAGGAGGGAGCGCCCT 



The nucleic acid sequence of NOV5 maps to chromosome 16 and has 555 of 857 bases 
(64%) identical to a gb:GENBANK-ID:AF317425|acc:AF317425.1 mRNA from Homo sapiens 
(Homo sapiens GAC-1 (GAC-1) mRNA, complete cds). 
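
The identity figure above follows from the two counts quoted with it. The Python sketch below is illustrative only; the function name `percent_identity` is an assumption introduced here, and it assumes that the percentage is truncated to a whole number rather than rounded, which the reported 64% for 555 of 857 bases (about 64.8%) suggests.

```python
# Illustrative sketch only; the function name is hypothetical.
# Assumption: the reported percentage is truncated, not rounded.

def percent_identity(matches, aligned_positions):
    """Whole-number percent identity from match and alignment-length counts."""
    if aligned_positions <= 0:
        raise ValueError("aligned_positions must be positive")
    return (100 * matches) // aligned_positions


nov5_nt_identity = percent_identity(555, 857)
```

With rounding instead of truncation the same counts would give 65%, so the truncation assumption matters when comparing against the percentages printed in the BLAST tables.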

The NOV5 polypeptide (SEQ ID NO:20) is 2058 amino acid residues in length and is 
presented using the one-letter amino acid code in Table 5B. The SignalP, Psort and/or 
Hydropathy results predict that NOV5a has a signal peptide and is likely to be localized in the 
nucleus with a certainty of 0.9800. In alternative embodiments, a NOV5a polypeptide is localized 
to the microbody (peroxisome) with a certainty of 0.3000, the mitochondrial matrix space with a 
certainty of 0.1000, or the lysosome (lumen) with a certainty of 0.1000. 



Table 5B. Encoded NOV5 Protein Sequence (SEQ ID NO:20) 

MPKGGCPKAPQQEELPLSSDMVEKQTGKKKDKVSLTKTPKLERGDGGKEVRERASKRKLPFTAGANGEQKDS^ 
TGTSPTALPLCDPFTYTAEEAKAERQKQGPERKRIKKEPVTRKAGLSGIRAGYPLSERQQVALLMQMTAEESA 
NSPVDTTPKHPSQSTVCQKGTPNSASKTKDKW^ 

TALHEACNRGYYDVAKQLLAAGAEVNTKGLDDDTPLHDAANNGHQVVKLLLRYGGNPQQSNRKGETPLKVANS 

PTMVNLLLGKGTYTSSEESSSEEEDAPSFAPSSSVDGNN 

DEQDRVPPVDDKHLLKKDYRKETKSNSFISI^ 

LRLSAHTILPGSKTREPSNAKQQKEKNKVKKKRKKETKGREVRFGKRSDKFCSSESESESSESGEDDRDSLGS 
SGCLKGSPLVLKDPSLFSSLSASSTSSHGSSAAQKQNDQHTKHWKTISSPAWSEVSSLSDSTRTRLTSESDYS 
SEGSSVESLKPVRKRQEHRKRASLSEKKSPFLSSAEGAVPKLD 
TYEYEDSKQKSDKAILLENDLSTENKLKAfL 

ERDKISKEKEKIFKEDKEKLKKEKVYREDSLSDRDSSFDFKGAKLILETVKEDSKERRRDSRAREKHPAREKE 
KPDKRKRYKEKDKDKSEKSILEKCQKDK^ 

QDKKNKQEKAEKKHAAEDKAKSKHKEKSDKEHSKERKSSRSADAEYRESEVSSDSFTDREDDKSACLPEKLKE 

KRHRHSSSSSKKSHDREEKKEDYKEGRKGQYEKD^ 

KWRDEKERHRDRHADRPKPSKDPGKKDARPREKLLGDG 

RHRSGDPKLKEKAKPADDGRKKGLDIPAKKPPGLDPPFKDKKLKESTPIPPAAENKLHPASGADSKDWLAGPH 
MKEVLPASPRPDQSRPTGVPTPTSVLSCPSYEEVMHTPRTPSCSADDYADLVFDCADSQHSTPVPTAPTSACS 
PSFFDRFSVASSGLSENASQAPARPLSTNLYRSVSVDIDKLFRQQSVPAASSYDSPMPPSMEDRAPLPPVPAE 
KFACLSPGYYSPDYGLPSPKVDALHCPPAAVVTVTPSPEGVFSSLQAKPSPSPPSLDTSEDQQATAAIIPPEP 
SYLEPLDEGPFSAVITEEPVEWAHPSEQALASSLIGGTSENPVSWPVGSDLLLKSPQRFPESPKRFCPADPLH 
SAAPGPFSASEAPYPAPPASPAPYALPVAELEDVKDVPAAISTSEAAPYAPPSGLESFFSNCKSLPEAPLDVA 
PEALGPLENSFLDGSRGLSHLGQVEPVPWADAFAGPEDDLDLGPFSLPELPLQTKDAADGEAEPVEESLAPPE 
EMPPGAPRELEPEPSGEPKLDVALEAAVEAETVPEERARGDPDSSVEPAPVPPEQLGSGDPSLCAPDGPAPNT 
VAQAQAADGAGPEDDTEASRAAAPAEGPPGQPEAAEPKPTAEAPKAPREIPQRMTRNRAQMLANQSKQGPPPS 
EKECAPTPAPVTRAKARGSEDDDAQAQHPRKRRFQRSTQQLQLNTSTQQTREVIQQTLAAIVDAIKLDAIEPY 
HSDRANPYFEYLQIRKKIEEKRKILCCITPQAPQCYAEYVTYTGSYLLDGKPLSKLHIPVIAPPPSLAEPLKE 
LFRQQEAVRGKLRLQHSIEREKLIVSCEQEILRVHCRAARTIANQAVPFSACTMLLDSEVYNMPLESQGDENK 
SVRDRFNARQFISWLQDVDDKYDRMKVCLLMRQQHEAAALNAVQRMEWQLKVQELDPAGHKSLCVNEVPSFYV 
PMVDVNDDFVLLPA 



The NOV5 amino acid sequence has 373 of 398 amino acid residues (93%) identical to, 
and 376 of 398 amino acid residues (93%) similar to, the 399 amino acid residue 
gi|17486077|ref|XP_066756.1| (XM_066756) protein from Homo sapiens (Human) (similar to 
KIAA0874 PROTEIN) (E = 0.0). 

NOV5 is expressed in at least the following tissues: Heart, Liver, Blood, Gall Bladder, 
Adrenal Gland/Suprarenal gland, Amygdala, Ascending Colon, Bone, Bone Marrow, Brain, 
Cervix, Dermis, Hippocampus, Kidney, Lung, Lymph node, Lymphoid tissue, Mammary 
gland/Breast, Ovary, Parotid Salivary glands, Pituitary Gland, Placenta, Prostate, Small Intestine, 
Spinal Cord, Spleen, Synovium/Synovial membrane, Testis, Thymus, Thyroid, Urinary Bladder, 
and Vulva. This information was derived by determining the tissue sources of the sequences that 
were included in the invention including, but not limited to, SeqCalling sources, Public EST 
sources, Literature sources, and/or RACE sources. 

NOV5 has homology to the amino acid sequences shown in the BLASTP data listed in 
Table 5C. 



Table 5C. BLAST results for NOV5 

Gene Index/Identifier                             Protein/Organism                                                      Length (aa)  Identity (%)    Positives (%)    Expect
gi|14140238|ref|NP_056023.1| (NM_015208)          KIAA0874 protein [Homo sapiens]                                       2062         804/2109 (38%)  1142/2109 (54%)  0.0
gi|17486077|ref|XP_066756.1| (XM_066756)          similar to KIAA0874 protein (H. sapiens) [Homo sapiens]               399          373/398 (93%)   376/398 (93%)    0.0
gi|7019449|ref|NP_037407.1| (NM_013275)           nasopharyngeal carcinoma susceptibility protein [Homo sapiens]        366          308/366 (84%)   315/366 (85%)    e-141
gi|6690397|gb|AAF24125.1|AF121775_1 (AF121775)    nasopharyngeal carcinoma susceptibility protein LZ16 [Homo sapiens]   366          308/366 (84%)   315/366 (85%)    e-141
gi|4240237|dbj|BAA74897.1| (AB020681)             KIAA0874 protein [Homo sapiens]                                       601          283/600 (47%)   365/600 (60%)    e-120
gi|17445427|ref|XP_065820.1| (XM_065820)          similar to putative (H. sapiens) [Homo sapiens]                       999          248/517 (47%)   301/517 (57%)    8e-80



The homology of these sequences is shown graphically in the ClustalW analysis shown in 
Table 5D. 



Table 5D. ClustalW Analysis of NOV5 

1) NOV5 (SEQ ID NO: 20) 

2) gi|14140238|ref|NP_056023.1| (NM_015208) KIAA0874 protein [Homo sapiens] (SEQ 
ID NO: 52) 

3) gi|17486077|ref|XP_066756.1| (XM_066756) similar to KIAA0874 protein (H. 
sapiens) [Homo sapiens] (SEQ ID NO: 53) 

4) gi|7019449|ref|NP_037407.1| (NM_013275) nasopharyngeal carcinoma susceptibility 
protein [Homo sapiens] (SEQ ID NO: 54) 

5) gi|6690397|gb|AAF24125.1|AF121775_1 (AF121775) nasopharyngeal carcinoma 
susceptibility protein LZ16 [Homo sapiens] (SEQ ID NO: 55) 

6) gi|4240237|dbj|BAA74897.1| (AB020681) KIAA0874 protein [Homo sapiens] (SEQ ID 
NO: 56) 

7) gi|17445427|ref|XP_065820.1| (XM_065820) similar to putative (H. sapiens) [Homo 
sapiens] (SEQ ID NO: 57) 

                          10        20        30        40        50
                  ....|....|....|....|....|....|....|....|....|....|
NOV5 COR100396092
gi|14140238|      MPKSGFTKPIQSENSDSDSNMVEKPYGRKSKDKIASYSKTPKIERSDVSK
gi|17486077|
gi|7019449|
gi|4240237|
gi|17445427|

                          60        70        80        90       100
                  ....|....|....|....|....|....|....|....|....|....|
NOV5 COR100396092
gi|14140238|      EMKEKSSMKRKLPFTISPSRNEERDSDTDSDPGHTSENWGERLISSYRTY
gi|17486077|
gi|7019449|
gi|4240237|
gi|17445427|

                         110       120       130       140       150
                  ....|....|....|....|....|....|....|....|....|....|
NOV5 COR100396092
gi|14140238|      SEKEGPEKKKTKKEAGNKKSTPVSILFGYPLSERKQMALLMQMTARDNSP
gi|17486077|
gi|7019449|
gi|4240237|
gi|17445427|

                         160       170       180       190       200
                  ....|....|....|....|....|....|....|....|....|....|
NOV5 COR100396092
gi|14140238|      DSTPNHPSQTTPAQKKTPSSSSRQKDKVNKRNERGETPLHMAAIRGDVKQ

                         210       220       230       240       250
NOV5 COR100396092
gi|14140238|      VKELISLGANVNVKDFAGWTPLHEACNVGYYDVAKILIAAGADVNTQGLD
gi|17486077|
gi|7019449|
gi|4240237|
gi|17445427|

                         260       270       280       290       300
NOV5 COR100396092
gi|14140238|      DDTPLHDSASSGHRDIVKLLLRHGGNPFQANKHGERPVDVAETEELELLL
gi|17486077|
gi|7019449|
gi|4240237|
gi|17445427|

                         310       320       330       340       350
NOV5 COR100396092
gi|14140238|      KREVPLSDDDESYTDSEEAQSVNPSSVDENIDSETEKDSLICESKQILPS

gi|14140238|
gi|17486077|
gi|7019449|
gi|4240237|
gi|17445427|

                        3210      3220      3230      3240      3250
                  ....|....|....|....|....|....|....|....|....|....|
NOV5 COR100396092 SPPSLDTSEDQQATAAIIPPEPSYLEPLDEGPFSAVITEEPVEWAHPSEQ
gi|14140238|
gi|17486077|
gi|7019449|
gi|4240237|
gi|17445427|

                        3260      3270      3280      3290      3300
NOV5 COR100396092 ALASSLIGGTSENPVSWPVGSDLLLKSPQRFPESPKRFCPADPLHSAAPG
gi|14140238|
gi|17486077|
gi|7019449|
gi|4240237|
gi|17445427|

                        3310      3320      3330      3340      3350
                  ....|....|....|....|....|....|....|....|....|....|
NOV5 COR100396092 PFSASEAPYPAPPASPAPYALPVAELEDVKDVPAAISTSEAAPYAPPSGL
gi|14140238|
gi|17486077|
gi|7019449|
gi|4240237|
gi|17445427|

                        3360      3370      3380      3390      3400
                  ....|....|....|....|....|....|....|....|....|....|
NOV5 COR100396092 ESFFSNCKSLPEAPLDVAPEALGPLENSFLDGSRGLSHLGQVEPVPWADA
gi|14140238|
gi|17486077|
gi|7019449|
gi|4240237|
gi|17445427|

                        3410      3420      3430      3440      3450
                  ....|....|....|....|....|....|....|....|....|....|
NOV5 COR100396092 FAGPEDDLDLGPFSLPELPLQTKDAADGEAEPVEESLAPPEEMPPGAPRE
gi|14140238|
gi|17486077|
gi|7019449|
gi|4240237|
gi|17445427|

                        3460      3470      3480      3490      3500
                  ....|....|....|....|....|....|....|....|....|....|
NOV5 COR100396092 LEPEPSGEPKLDVALEAAVEAETVPEERARGDPDSSVEPAPVPPEQLGSG
gi|14140238|
gi|17486077|
gi|7019449|
gi|4240237|
gi|17445427|

                        3510      3520      3530      3540      3550
                  ....|....|....|....|....|....|....|....|....|....|
NOV5 COR100396092 DPSLCAPDGPAPNTVAQAQAADGAGPEDDTEASRAAAPAEGPPGQPEAAE
gi|14140238|
gi|17486077|
gi|7019449|
gi|4240237|
gi|17445427|

                        3560      3570      3580      3590      3600
                  ....|....|....|....|....|....|....|....|....|....|
NOV5 COR100396092 PKPTAEAPKAPREIPQRMTRNRAQMLANQSKQGPPPSEKECAPTPAPVTR
gi|14140238|
gi|17486077|
gi|7019449|
gi|4240237|
gi|17445427|

                        3610      3620      3630      3640      3650
                  ....|....|....|....|....|....|....|....|....|....|
NOV5 COR100396092 AKARGSEDDDAQAQHPRKRRFQRSTQQLQLNTSTQQTREVIQQTLAAIVD
gi|14140238|
gi|17486077|
gi|7019449|
gi|4240237|
gi|17445427|

                        3660      3670      3680      3690      3700
                  ....|....|....|....|....|....|....|....|....|....|
NOV5 COR100396092 AIKLDAIEPYHSDRANPYFEYLQIRKKIEEKRKILCCITPQAPQCYAEYV
gi|14140238|
gi|17486077|
gi|7019449|
gi|4240237|
gi|17445427|

                        3710      3720      3730      3740      3750
NOV5 COR100396092 TYTGSYLLDGKPLSKLHIPVIAPPPSLAEPLKELFRQQEAVRGKLRLQHS
gi|14140238|
gi|17486077|
gi|7019449|
gi|4240237|
gi|17445427|

                        3760      3770      3780      3790      3800
                  ....|....|....|....|....|....|....|....|....|....|
NOV5 COR100396092 IEREKLIVSCEQEILRVHCRAARTIANQAVPFSACTMLLDSEVYNMPLES
gi|14140238|

                        3810      3820      3830      3840      3850
                  ....|....|....|....|....|....|....|....|....|....|
NOV5 COR100396092 QGDENKSVRDRFNARQFISWLQDVDDKYDRMKVCLLMRQQHEAAALNAVQ
gi|14140238|
gi|17486077|
gi|7019449|
gi|4240237|
gi|17445427|

                        3860      3870      3880      3890
NOV5 COR100396092 RMEWQLKVQELDPAGHKSLCVNEVPSFYVPMVDVNDDFVLLPA
gi|14140238|
gi|17486077|
gi|7019449|
gi|4240237|
gi|17445427|

Tables 5E, 5F, 5G, 5H, 5I and 5J list the domain descriptions from DOMAIN analysis 
results against NOV5. This indicates that the NOV5 sequence has properties similar to those of 
other proteins known to contain these domains. 



Table 5E. Domain Analysis of NOV5 

gnl|Pfam|pfam00023, ank, Ank repeat. Ankyrin repeats generally consist of a beta, alpha, 
alpha, beta order of secondary structures. The repeats associate to form a higher order 
structure. 

(SEQ ID NO: 58) 

CD-Length = 33 residues, 84.8% aligned 
Score = 45.8 bits (107), Expect = 2e-05 

NOV 5:  218  GWTALHEACNRGYYDVAKQLLAAGAEVN  245
Sbjct:    2  GNTPLHLAARNGHLEVVKLLLEAGADVN   29

Table 5F. Domain Analysis of NOV5 

gnl|Pfam|pfam00023, ank, Ank repeat. Ankyrin repeats generally consist of a beta, alpha, 
alpha, beta order of secondary structures. The repeats associate to form a higher order 
structure. 

(SEQ ID NO: 59) 

CD-Length = 33 residues, 100.0% aligned 
Score = 43.1 bits (100), Expect = 2e-04 

NOV 5:  250  DDDTPLHDAANNGH-QVVKLLLRYGGNPQQSNR  281
Sbjct:    1  DGNTPLHLAARNGHLEVVKLLLEAGADVNARDK   33



Table 5G. Domain Analysis of NOV5 

gnl|Pfam|pfam00023, ank, Ank repeat. Ankyrin repeats generally consist of a beta, alpha, 
alpha, beta order of secondary structures. The repeats associate to form a higher order 
structure. 

(SEQ ID NO: 60) 

CD-Length = 33 residues, 93.9% aligned 
Score = 42.0 bits (97), Expect = 3e-04 

NOV 5:  185  GETRLHRAAIRGDARRIKELISEGADVNVKD  215
Sbjct:    2  GNTPLHLAARNGHLEVVKLLLEAGADVNARD   32



Table 5H. Domain Analysis of NOV5 

gnl|Smart|smart00248, ANK, ankyrin repeats; Ankyrin repeats are about 33 amino 
acids long and occur in at least four consecutive copies. They are involved in protein- 
protein interactions. The core of the repeat seems to be a helix-loop-helix structure. 

(SEQ ID NO:61) 

CD-Length = 30 residues, 93.3% aligned 
Score = 43.1 bits (100), Expect = 2e-04 

NOV 5:  218  GWTALHEACNRGYYDVAKQLLAAGAEVN  245
Sbjct:    2  GRTPLHLAAENGNLEVVKLLLDKGADIN   29

Table 5I. Domain Analysis of NOV5 

gnl|Smart|smart00248, ANK, ankyrin repeats; Ankyrin repeats are about 33 amino 
acids long and occur in at least four consecutive copies. They are involved in protein- 
protein interactions. The core of the repeat seems to be a helix-loop-helix structure. 

(SEQ ID NO: 62) 

CD-Length = 30 residues, 93.3% aligned 
Score = 41.2 bits (95), Expect = 6e-04 

NOV 5:  250  DDDTPLHDAANNGH-QVVKLLLRYGGNP  276
Sbjct:    1  DGRTPLHLAAENGNLEVVKLLLDKGADI   28

Table 5J. Domain Analysis of NOV5 

gnl|Smart|smart00248, ANK, ankyrin repeats; Ankyrin repeats are about 33 amino 
acids long and occur in at least four consecutive copies. They are involved in protein- 
protein interactions. The core of the repeat seems to be a helix-loop-helix structure. 

(SEQ ID NO: 63) 

CD-Length = 30 residues, 96.7% aligned 

Score = 39.3 bits (90), Expect = 0.002 

NOV 5:  185  GETRLHRAAIRGDARRIKELISEGADVNV  213
Sbjct:    2  GRTPLHLAAENGNLEVVKLLLDKGADINL   30
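
The "% aligned" figures in Tables 5E to 5J appear to correspond to the fraction of the domain model (the Sbjct sequence) covered by each hit. The Python sketch below is illustrative only; the function name `percent_aligned` and the one-decimal rounding are assumptions introduced here, and the coordinate tuples are simply the Sbjct start, Sbjct end and CD-Length values printed in those tables.

```python
# Illustrative sketch only; the function name and rounding rule are assumptions.

def percent_aligned(sbjct_start, sbjct_end, cd_length):
    """Percentage of the domain model covered by a hit, to one decimal place."""
    covered = sbjct_end - sbjct_start + 1
    return round(100.0 * covered / cd_length, 1)


# (Sbjct start, Sbjct end, CD-Length) as printed in Tables 5E to 5J.
NOV5_DOMAIN_HITS = {
    "5E": (2, 29, 33),
    "5F": (1, 33, 33),
    "5G": (2, 32, 33),
    "5H": (2, 29, 30),
    "5I": (1, 28, 30),
    "5J": (2, 30, 30),
}

nov5_percent_aligned = {
    table: percent_aligned(*coords) for table, coords in NOV5_DOMAIN_HITS.items()
}
```

Under these assumptions the computed values can be compared directly with the percentages stated in each table header.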

Ankyrin repeats are tandemly repeated modules of about 33 amino acids. They occur in a 
large number of functionally diverse proteins, mainly from eukaryotes. The few known examples 
from prokaryotes and viruses may be the result of horizontal gene transfers. The conserved fold of 
the ankyrin repeat unit is known from several crystal and solution structures, e.g., from: 
p53-binding protein 53BP2, cyclin-dependent kinase inhibitor p19Ink4d, transcriptional regulator 
GABP-beta, and NF-kappaB inhibitory protein IkB-alpha. The fold has been described as an 
L-shaped structure consisting of a beta-hairpin and two alpha-helices. Many ankyrin repeat regions 
are known to function as protein-protein interaction domains. 
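
The tandem arrangement described above can be illustrated with the three Pfam ank hits reported for NOV5 in Tables 5E, 5F and 5G. The Python sketch below is illustrative only. The function names are assumptions introduced here, and the calculation assumes that projecting each hit back to position 1 of the 33-residue domain model gives a comparable repeat origin in the NOV5 sequence.

```python
# Illustrative sketch only; function names are hypothetical.
# Each tuple is (NOV5 start residue, Sbjct start position) taken from
# Tables 5G, 5E and 5F respectively.
ANK_HITS = [(185, 2), (218, 2), (250, 1)]


def repeat_origins(hits):
    """NOV5 residue that would align with position 1 of the domain model."""
    return sorted(query_start - (sbjct_start - 1) for query_start, sbjct_start in hits)


def spacings(origins):
    """Distances between consecutive repeat origins."""
    return [later - earlier for earlier, later in zip(origins, origins[1:])]


origins = repeat_origins(ANK_HITS)
repeat_spacings = spacings(origins)
```

Under these assumptions the three hits are spaced 33 residues apart, which is consistent with consecutive repeat modules of about 33 amino acids.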

The protein similarity information, expression pattern, and map location for the NOV5 
protein and nucleic acid disclosed herein suggest that they may have important structural and/or 
physiological functions characteristic of the ankyrin repeat containing protein family. Therefore, 
the nucleic acids and proteins of the invention are useful in potential diagnostic and therapeutic 
applications and as a research tool. These include serving as a specific or selective nucleic acid or 
protein diagnostic and/or prognostic marker, wherein the presence or amount of the nucleic acid or 
the protein is to be assessed, as well as potential therapeutic applications such as the following: 
(i) a protein therapeutic, (ii) a small molecule drug target, (iii) an antibody target (therapeutic, 
diagnostic, drug targeting/cytotoxic antibody), (iv) a nucleic acid useful in gene therapy (gene 
delivery/gene ablation), (v) a composition promoting tissue regeneration in vitro and in vivo, and 
(vi) a biological defense weapon. 

The nucleic acids and proteins of the invention are useful in potential diagnostic and 
therapeutic applications implicated in various diseases and disorders described below and/or other 
pathologies. For example, the compositions of the present invention will have efficacy for 
treatment of patients suffering from: Cardiovascular disorders, Cardiomyopathy, 
Atherosclerosis, Hypertension, Congenital heart defects, Aortic stenosis, Atrial septal defect 
(ASD), Atrioventricular (A-V) canal defect, Ductus arteriosus, Pulmonary stenosis, Subaortic 
stenosis, Ventricular septal defect (VSD), valve diseases, Tuberous sclerosis, Scleroderma, 
Obesity, Transplantation, Systemic lupus erythematosus, Autoimmune disease, Asthma, 
Emphysema, Scleroderma, allergy, Diabetes, Autoimmune disease, Renal artery stenosis, 
Interstitial nephritis, Glomerulonephritis, Polycystic kidney disease, Systemic lupus 
erythematosus, Renal tubular acidosis, IgA nephropathy, Hypercalcemia, Lesch-Nyhan syndrome 
and other diseases, disorders and conditions of the like. 

NOV6 

A disclosed NOV6 nucleic acid (designated as CuraGen Acc. No. COR87941483) 
encodes a novel TNF intracellular domain interacting protein and includes the 1749 nucleotide 
sequence (SEQ ID NO:21) shown in Table 6A. An open reading frame for the mature protein 
was identified beginning with an ATG codon at nucleotides 103-105 and ending with a TAG 
codon at nucleotides 1579-1581. Putative untranslated regions downstream from the termination 
codon and upstream from the initiation codon are underlined in Table 6A, and the start and stop 
codons are in bold letters. 

Table 6A. NOV6a Nucleotide Sequence (SEQ ID NO:21) 



AGAACGCGGAGAGTCGCCGCCTGGCCGGGCGTAGACGCGGTGGCAGAGCCCGCGCGGCGCTGGAA 

GCGAGTGGCGGAGCGGCGGGACCTCGGCGGACTCGCCATGGAGGAGGAGGGTGTGAAGGAAGCCG 

GTGAGAAGCCTCGGGGAGCACAGATGGTGGACAAGGCTGGCTGGATCAAGAAGAGCAGTGGGGGC 

CTCCTGGGTTTCTGGAAAGACCGATATCTGCTCCTCTGCCAGGCCCAGCTGCTGGTCTATGAGAATG 

AGGATGATCAGAAGTGTGTGGAGACTGTGGAGCTGGGCAGCTATGAGAAGTGCCAGGACCTTCGTG 

CCCTCCTCAAGCGAAAACACCGCTTTATCCTGCTGCGATCCCCAGGGAACAAGGTCAGCGACATCA 

AATTCCAGGCACCCACCGGGGAGGAGAAGGAATCCTGGATCAAAGCCCTCAATGAAGGGATTAAC 

CGAGGCAAAAACAAGGCTTTCGATGAGGTAAAGGTGGACAAGAGCTGCGCCCTGGAGCATGTGAC 

ACGGGACCGGGTGCGAGGGGGCCAGCGACGCCGGCCACCAACGAGAGTCCACCTGAAGGAGGTGG 

CCAGTGCAGCTTCTGACGGTCTTCTGCGCCTGGATCTTGATGTTCCGGACAGTGGGCCACCAGTGTT 

TGCCCCCAGCAATCATGTCAGTGAAGCCCAACCTCGGGAGACACCCCGGCCCCTCATGCCTCCTACC 

AAGCCTTTCCTAGCACCTGAGACCACCAGCCCTGGTGACAGGGTGGAGACCCCTGTGGGGGAGAGA 

GCCCCAACCCCTGTCTCAGCAAGCTCTGAGGTCTCCCCTGAGAGCCAAGAGGACTCAGAGACCCCA 

GCAGAGGAGGACAGTGGCTCTGAGCAGCCTCCCAACAGCGTCCTGCCTGACAAACTGAAGGTGAGC 

TGGGAGAACCCCAGCCCCCAGGAGGCCCCTGCTGCAGAGAGTGCAGAACCGTCCCAGGCACCCTGT 

TCTGAGACTTCTGAGGCTGCCCCCAGGGAGGGTGGGAAGCCCCCTACACCCCCACCCAAGATCTTA 

TCAGAAGAACACTTGAAAGCCTCCATGGGTGAGATGCAGGCTTCTGGGCCACCTGCTCCAGGCACA 

GTGAAAGGTCTCAGTCAAATGGCAAGAATGGAAGGACTGAGCATTGCCAAGCACTCTAAGGCTGA 

AGGCACCCAAAGAACTTCTCCAAAGGATGCACTAACACACCAAGCACTGCCCCCCTGGGACCTGCC 

ACCTCAGTTCCATCACCGCTGCTCCTCCCTTGGGGACTTGCTTGGGGAAGGCCCGCGGCATCCCTTG 

CAGCCCAGGCAACGGCTATATCGGGCCCAGCTGGAGGTGAAGGTGGCCTCGGAACAGACGGAGAA 

ACTGTTGAACAAGGTGCTGGGCAGTGAGCCGGCCCCTGTTAGTGCCGAAACATTGCTCAGCCAGGC 

TGTGGAGCAGCTGAGGCAGGCCACCCAGGTCCTGCAGGAAATGAGAGATTTGGGAGAGCTGAGCC 

AGGAAGCACCTGGGCTAAGGGAGAAGCGGAAGGAGCTGGTGACCCTCTACAGGAGAAGTGCACCC 

TAGGGCCTTCTGGGCCAGAGGCACCATCCCTTCTGGCCATCCATCAAGTCCATCAAGGCCCAGCCCT 

GCTGAGAAATGTGCTTCTGCTTCTACAGCAATGGCTGCAGGAGGGCCATTGGGCATGTCAGGGTTT 

GGCCATGACCCGAAGAGACTCCTGGCGTCCTTCCTACT 

The nucleic acid sequence of NOV6 maps to chromosome 15 and has 360 of 631 bases 
(57%) identical to a gb:GENBANK-ID:AF168676|acc:AF168676.1 mRNA from Homo sapiens 
(Homo sapiens TNF intracellular domain-interacting protein mRNA, complete cds). 

The NOV6 polypeptide (SEQ ID NO:22) is 492 amino acid residues in length and is 
presented using the one-letter amino acid code in Table 6B. The SignalP, Psort and/or 
Hydropathy results predict that NOV6 has a signal peptide and is likely to be localized to the 
nucleus with a certainty of 0.7000. In alternative embodiments, a NOV6 polypeptide is localized to 
the mitochondrial matrix space with a certainty of 0.1000 or the lysosome (lumen) with a certainty 
of 0.1000. 



Table 6B. Encoded NOV6 Protein Sequence (SEQ ID NO:22) 



MEEEGVKEAGEKPRGAQMVDKAGWIKKSSGGLLGFWKDRYLLLCQAQLLVYENEDDQKCVETVE 
LGSYEKCQDLRALLKRKHRFILLRSPGNKVSDIKFQAPTGEEKESWIKALNEGINRGKNKAFDEVKV 

DKSCALEHVTRDRVRGGQRRRPPTRVHLKEVASAASDGLLRLDLDVPDSGPPVFAPSNHVSEAQPRE 

TPRPLMPPTKPFLAPETTSPGDRVETPVGERAPTPVSASSEVSPESQEDSETPAEEDSGSEQPPNSVLPD 

KLKVSWENPSPQEAPAAESAEPSQAPCSETSEAAPREGGKPPTPPPKILSEEHLKASMGEMQASGPPA 

PGTVKGLSQMARMEGLSIAKHSKAEGTQRTSPKDALTHQALPPWDLPPQFHHRCSSLGDLLGEGPR 

HPLQPRQRLYRAQLEVKVASEQTEKLLNKVLGSEPAPVSAETLLSQAVEQLRQATQVLQEMRDLGE 

LSQEAPGLREKRKELVTLYRRSAP 






The NOV6 amino acid sequence has 263 of 289 amino acid residues (91%) identical to, 
and 269 of 289 amino acid residues (93%) similar to, the 399 amino acid residue 
gi|18027838|gb|AAL55880.1|AF318373_1 (AF318373) protein from Homo sapiens (Human) 
(UNKNOWN) (E = e-102). 

NOV6 has homology to the amino acid sequences shown in the BLASTP data listed in 
Table 6C. 



Table 6C. BLAST results for NOV6 

Gene Index/Identifier                             Protein/Organism         Length (aa)  Identity (%)   Positives (%)  Expect
gi|18027838|gb|AAL55880.1|AF318373_1 (AF318373)   unknown [Homo sapiens]   287          263/289 (91%)  269/289 (93%)  e-102



The homology of this sequence is shown graphically in the ClustalW analysis shown in 
Table 6D. 






Table 6D. ClustalW Analysis of NOV6 

1) NOV6 (SEQ ID NO: 22) 

2) gi|18027838|gb|AAL55880.1|AF318373_1 (AF318373) unknown [Homo sapiens] (SEQ ID 
NO:64) 



                          10        20        30        40        50
                  ....|....|....|....|....|....|....|....|....|....|
NOV6 COR87941483  MEEEGVKEAGEKPRGAQMVDKAGWIKKSSGGLLGFWKDRYLLLCQAQLLV
gi|18027838|

                          60        70        80        90       100
NOV6 COR87941483  YENEDDQKCVETVELGSYEKCQDLRALLKRKHRFILLRSPGNKVSDIKFQ
gi|18027838|

                         110       120       130       140       150
                  ....|....|....|....|....|....|....|....|....|....|
NOV6 COR87941483  APTGEEKESWIKALNEGINRGKNKAFDEVKVDKSCALEHVTRDRVRGGQR
gi|18027838|

                         160       170       180       190       200
                  ....|....|....|....|....|....|....|....|....|....|
NOV6 COR87941483  RRPPTRVHLKEVASAASDGLLRLDLDVPDSGPPVFAPSNHVSEAQPRETP
gi|18027838|

                         210       220       230       240       250
                  ....|....|....|....|....|....|....|....|....|....|
NOV6 COR87941483  RPL
gi|18027838|

                         260       270       280       290       300
                  ....|....|....|....|....|....|....|....|....|....|
NOV6 COR87941483
gi|18027838|

                         310       320       330       340       350
                  ....|....|....|....|....|....|....|....|....|....|
NOV6 COR87941483  EH KGL QMARME
gi|18027838|      K- Q-V VNGMDD

                         360       370       380       390       400
                  ....|....|....|....|....|....|....|....|....|....|
NOV6 COR87941483  GLSI H K QR S L HQ H
gi|18027838|      SPEP P Q PG P T ST P

                         410       420       430       440       450
                  ....|....|....|....|....|....|....|....|....|....|
NOV6 COR87941483  Q
gi|18027838|      E

                         460       470       480       490
                  ....|....|....|....|....|....|....|....|..
NOV6 COR87941483
gi|18027838|

Tables 6E and 6F list the domain description from DOMAIN analysis results against 
NOV6. This indicates that the NOV6 sequence has properties similar to those of other proteins 
known to contain these domains. 







Table 6E. Domain Analysis of NOV6 

gnl|Pfam|pfam00169, PH, PH domain. PH stands for pleckstrin homology. 

(SEQ ID NO: 65) 
CD-Length = 100 residues, 99.0% aligned 

Score = 57.1 bits (138), Expect = 1e-09 

NOV6:  19 VDKAGWIKKSSGGLLGFWKDRYLLLCQAQLLVYENE-DDQKCVETVELGSYEKCQDLRAL 77 
Sbjct:  1 IVKEGWLLKKSTVKKKRWKKRYFFLFNDVLIYYKDKKKSYEPKGSIPLSGCSVEDVPDSE 60 

NOV6:  78 LKRKHRFILLRSPGNKVSDIKFQAPTGEEKESWIKALNEGI 118 
Sbjct: 61 FKRPNCFQLRSRDGKET--FILQAESEEERQDWIKAIQSAI 99 
Table 6F. Domain Analysis of NOV6 

gnl|Smart|smart00233, PH, Pleckstrin homology domain; Domain commonly found 
in eukaryotic signalling proteins. The domain family possesses multiple functions 
including the abilities to bind inositol phosphates, and various proteins. PH domains 
have been found to possess inserted domains (such as in PLC gamma, syntrophins) 
and to be inserted within other domains. Mutations in Bruton's tyrosine kinase (Btk) 
within its PH domain cause X-linked agammaglobulinaemia (XLA) in patients. 
Point mutations cluster into the positively charged end of the molecule around the 
predicted binding site for phosphatidylinositol lipids. 

(SEQ ID NO: 66) 
CD-Length = 104 residues, 99.0% aligned 

Score = 57.8 bits (138), Expect = 1e-09 

NOV6:  19 VDKAGW I KKS SGGLLGFWKDRYLLLCQAQLLVYENE DDQKCVETVELGS YEKCQDLR 75 
Sbjct:  1 VIKEGWLLKKSSGGKKSWKKRYFVLFNGVLLYYKSKKKKSSSKPKGSIPLSGCTVREAPD 60 

NOV6:  76 ALLKRKHRFILLRSPGNKVSDIKFQAPTGEEKESWIKALNEGINR 120 
Sbjct: 61 SDSDKKKNCFEIVTPDRKT--LLLQAESEEERKEWVEALRKAIAK 103 

The protein similarity information, expression pattern, and map location for the NOV6 
protein and nucleic acid disclosed herein suggest that it may have important structural and/or 
physiological functions characteristic of the family. Therefore, the nucleic acids and proteins of 
the invention are useful in potential diagnostic and therapeutic applications and as a research tool. 
These include serving as a specific or selective nucleic acid or protein diagnostic and/or 
prognostic marker, wherein the presence or amount of the nucleic acid or the protein is to be 
assessed, as well as potential therapeutic applications such as the following: (i) a protein 
therapeutic, (ii) a small molecule drug target, (iii) an antibody target (therapeutic, diagnostic, drug 
targeting/cytotoxic antibody), (iv) a nucleic acid useful in gene therapy (gene delivery/gene 
ablation), (v) a composition promoting tissue regeneration in vitro and in vivo, and (vi) a biological 
defense weapon. 

The NOV6 nucleic acid and protein are useful in potential diagnostic and therapeutic 
applications implicated in various diseases and disorders described below and/or other 
pathologies. For example, the compositions of the present invention will have efficacy for 
treatment of patients suffering from: Cardio-vascular disorders, Cardiomyopathy, Atherosclerosis, 
Hypertension, Congenital heart defects, Aortic stenosis, Atrial septal defect (ASD), 
Atrioventricular (A-V) canal defect, Ductus arteriosus, Pulmonary stenosis, Subaortic stenosis, 
Ventricular septal defect (VSD), valve diseases, Tuberous sclerosis, Scleroderma, Obesity, 
Transplantation, Systemic lupus erythematosus, Autoimmune disease, Asthma, Emphysema, 
Scleroderma, allergy, Diabetes, Autoimmune disease, Renal artery stenosis, Interstitial nephritis, 
Glomerulonephritis, Polycystic kidney disease, Systemic lupus erythematosus, Renal tubular 
acidosis, IgA nephropathy, Hypercalcemia, Lesch-Nyhan syndrome and other diseases, disorders 
and conditions of the like. 

The 'pleckstrin homology' (PH) domain is a domain of about 100 residues that occurs in a 
wide range of proteins involved in intracellular signaling or as constituents of the cytoskeleton. 
The function of this domain is not clear; several putative functions have been suggested: 

- binding to the beta/gamma subunit of heterotrimeric G proteins, 

- binding to lipids, e.g. phosphatidylinositol-4,5-bisphosphate, 

- binding to phosphorylated Ser/Thr residues, 

- attachment to membranes by an unknown mechanism. 

It is possible that different PH domains have totally different ligand requirements. The 3D 
structure of several PH domains has been determined. All known cases have a common structure 
consisting of two perpendicular anti-parallel beta sheets, followed by a C-terminal amphipathic 
helix. The loops connecting the beta-strands differ greatly in length, making the PH domain 
relatively difficult to detect. There are no totally invariant residues within the PH domain. 

Proteins reported to contain one or more PH domains belong to the following families: 

- Pleckstrin, the protein where this domain was first detected, is the major substrate of 
protein kinase C in platelets. Pleckstrin is one of the rare proteins to contain two PH domains. 

- Ser/Thr protein kinases such as the Akt/Rac family, the beta-adrenergic receptor kinases, 
the mu isoform of PKC and the trypanosomal NrkA family. 

- Tyrosine protein kinases belonging to the Btk/Itk/Tec subfamily. 

- Insulin Receptor Substrate 1 (IRS-1). 

- Regulators of small G-proteins like guanine nucleotide releasing factor 

NOV7 

A disclosed NOV7 nucleic acid (designated as CuraGen Acc. No. COR101716725) 
encodes a novel secretory protein and includes the 1491 nucleotide sequence (SEQ ID NO:23) 
shown in Table 7A. An open reading frame for the mature protein was identified beginning with 
an ATG codon at nucleotides 31-33 and ending with a TGA codon at nucleotides 1324-1326. 
Putative untranslated regions are underlined in Table 7A, and the start and stop codons are in bold 
letters. 
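
The stated codon coordinates can be checked for internal consistency: an open reading frame running from the first base of the start codon to the last base of the stop codon must span a multiple of three nucleotides, and the encoded polypeptide has one residue fewer than the number of codons. The following is a minimal sketch of that check; the function name `orf_protein_length` is hypothetical and is not part of any tool named in this disclosure.

```python
def orf_protein_length(start_first_base: int, stop_last_base: int) -> int:
    """Number of residues encoded by an ORF given 1-based inclusive coordinates.

    start_first_base: position of the first base of the start codon.
    stop_last_base:   position of the last base of the stop codon.
    """
    span = stop_last_base - start_first_base + 1
    if span <= 0 or span % 3 != 0:
        raise ValueError(f"span of {span} nt is not a whole number of codons")
    # The stop codon is counted in the span but encodes no residue.
    return span // 3 - 1


# NOV7: ATG at nucleotides 31-33, TGA at nucleotides 1324-1326
print(orf_protein_length(31, 1326))
```

For the NOV7 coordinates the result agrees with the 431 residue length reported for the NOV7 polypeptide.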



Table 7A. NOV7 Nucleotide Sequence (SEQ ID NO:23) 



GGGCCCGCGCAGCCCCGGCCGGAACCCACC ATGCGGCGGCTGCGGCGCCTGGCGCACCTGGTGCTC 

TTCTGCCCCTTCTCCAAGCGCCTGCAGGGCCGGCTCCCAGGCCTCAGGGTCCGCTGCATCTTCCTGG 

CCTGGCTGGGCGTCTTTGCAGGCAGCTGGCTGGTGTACGTGCACTACTCGTCCTACTCGGAGCGCTG 

TCGCGGCCATGTCTGCCAGGTGGTCATTTGTGACCAGTACCGCAAGGGGATCATCTCGGGCTCCGTC 

TGCCAGGACCTGTGTGAGCTGCATATGGTGGAGTGGAGGACCTGCCTCTCGGTGGCCCCGGGCCAG 

CAGGTGTACAGCGGGCTCTGGCGGGACAAGGATGTAACCATCAAGTGTGGCATTGAGGAGACCCTC 

GACTCCAAGGCCCGGTCGGATGCGGCCCCCCGGCGGGAGCTGGTACTGTTTGACAAGCCCACCCGG 

GGCACCTCCATCAAGGAATTCCGGGAGATGACCCTCGGCTTCCTCAAGGCGAACCTGGGAGACCTG 

CCTTCCCTGCCGGCGCTGGTTGGCCAGGTCCTGCTCATGGCTGACTTCAACAAGGACAACCGGGTGT 

CCCTGGCGGAAGCCAAGTCCGTGTGGGCCCTGCTGCAGCGTAACGAGTTCCTGCTGCTGCTGTCCCT 

GCAGGAGAAGGAGCACGCCTCCAGACTGCTGGGCTACTGTGGGGACCTCTACCTCACCGAGGGCGT 

GCCGCATGGCGCCTGGCACGCGGCCGCCCTTCCACCCCTGTTGCGCCCACTGCTGCCGCCTGCCCTG 

CAGGGTGCTCTCCAGCAGTGGCTGGGGCCTGCGTGGCCTTGGCGGGCCAAGATCGCCATCGGCCTG 

CTGGAGTTCGTGGAGGAGCTCTTCCACGGCTCTTACGGGACTTTCTACATGTGTGAGACCACACTGG 

CCAACGTGGGCTACACAGCCACCTACGACTTCAAGATGGCCGACCTGCAGCAGGTGGCACCCGAGG 

CCACCGTGCGCCGCTTCCTGCAGGGCCGCCGCTGCGAGCACAGCACCGACTGCACCTACGGGCGCG 

ACTGCAGGGCCCCGTGTGACAGGCTCATGAGGCAGTGCAAGGGCGACCTCATCCAGCCCAACCTGG 

CCAAGGTGTGCGCACTGCTACGGGGCTACCTGCTGCCTGGCGCGCCCGCCGACCTCCGCGAGGAGC 

TGGGCACACAGCTGCGCACCTGTACCACGCTGAGCGGGCTGGCCAGCCAGGTGGAGGCCCATCACT 

CGCTGGTGCTCAGCCACCTCAAGACTCTGCTCTGGAAGAAGATCTCCAACACCAAGTACTCTTGAT 

GGGGCA GT GAGGGGCCTGGCCACCCTTCCTGGAGCTGGCCAGGTGCCAGGGTCCAACCC TCCCTCA 

AGGAGAG TCCTCCAAGGGGGTTTGTTACTCTGAAGAACGTAATGTCAATAAACAGCTTTTATGTAAT 

GCCCAGGG CTGAGCACCCTGAGCCCCCATCA 



The nucleic acid sequence of NOV7 has 1137 of 1347 bases (84%) identical to a 
gb:GENBANK-ID:AB030186|acc:AB030186.1 mRNA from Mus musculus (Mus musculus 
mRNA, complete cds, clone: 1-82). 

The NOV7 polypeptide (SEQ ID NO:24) is 431 amino acid residues in length and is 
presented using the one-letter amino acid code in Table 7B. The SignalP, Psort and/or 
Hydropathy results predict that NOV7 has a signal peptide and is likely to be located outside of 
the cell with a certainty of 0.6615. In alternative embodiments, a NOV7 polypeptide is located to 
the microbody (peroxisome) with a certainty of 0.1215, the endoplasmic reticulum (membrane) 
with a certainty of 0.1000, or the endoplasmic reticulum (lumen) with a certainty of 0.1000. The 
SignalP predicts a likely cleavage site for a NOV7 peptide between amino acid positions 59 and 
60, i.e., at the dash in the sequence CRG-HV. 
Table 7B. Encoded NOV7 Protein Sequence (SEQ ID NO:24) 

MRRLRRLAHLVLFCPFSKRLQGRLPGLRVRCIFLAWLGVFAGSWLVYVHYSSYSERCRGHVCQVVI 
CDQYRKGIISGSVCQDLCELHMVEWRTCLSVAPGQQVYSGLWRDKDVTIKCGIEETLDSKARSDAA 
PRRELVLFDKPTRGTSIKEFREMTLGFLKANLGDLPSLPALVGQVLLMADFNKDNRVSLAEAKSVW 
ALLQRNEFLLLLSLQEKEHASRLLGYCGDLYLTEGVPHGAWHAAALPPLLRPLLPPALQGALQQWL 
GPAWPWRAKIAIGLLEFVEELFHGSYGTFYMCETTLANVGYTATYDFKMADLQQVAPEATVRRFLQ 
GRRCEHSTDCTYGRDCRAPCDRLMRQCKGDLIQPNLAKVCALLRGYLLPGAPADLREELGTQLRTC 
TTLSGLASQVEAHHSLVLSHLKTLLWKKISNTKYS 
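
The predicted signal peptide cleavage site between positions 59 and 60 can be located in the Table 7B sequence by simple indexing. The sketch below uses only the first 66 residues of Table 7B; the helper names `cleavage_context` and `split_at_cleavage` are hypothetical and are used for illustration only, and the code does not reproduce the SignalP prediction itself.

```python
# First 66 residues of the NOV7 polypeptide as listed in Table 7B
NOV7_N_TERMINUS = (
    "MRRLRRLAHLVLFCPFSKRLQGRLPGLRVRCIFLAWLGVFAGSWLVYVHYSSYSERCRGHVCQVVI"
)


def cleavage_context(seq: str, last_signal_residue: int,
                     left: int = 3, right: int = 2) -> str:
    """Residues flanking a cleavage site, with a dash at the scissile bond.

    last_signal_residue is the 1-based position of the final residue
    of the signal peptide.
    """
    upstream = seq[last_signal_residue - left:last_signal_residue]
    downstream = seq[last_signal_residue:last_signal_residue + right]
    return f"{upstream}-{downstream}"


def split_at_cleavage(seq: str, last_signal_residue: int) -> tuple[str, str]:
    """Split a sequence into (signal peptide, remainder)."""
    return seq[:last_signal_residue], seq[last_signal_residue:]


signal_peptide, mature_start = split_at_cleavage(NOV7_N_TERMINUS, 59)
print(cleavage_context(NOV7_N_TERMINUS, 59))
print(len(signal_peptide), mature_start)
```

The context string reproduces the CRG-HV motif given above, with the mature portion beginning at residue 60.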

The NOV7 amino acid sequence has 255 of 256 amino acid residues (99%) identical to, 
and 255 of 256 amino acid residues (99%) similar to, the 266 amino acid residue 
gi|18027802|gb|AAL55862.1|AF318355_1 AF318355 protein from Homo sapiens (Human) 
(UNKNOWN) (E = e-136). 

NOV7 is expressed in at least the following tissues: Adipose, Adrenal Gland/Suprarenal 
gland, Amygdala, Aorta, Bone, Bone Marrow, Brain, Cerebral Medulla/Cerebral white matter, 
Cervix, Chorionic Villus, Colon, Coronary Artery, Dermis, Epidermis, Foreskin, Frontal Lobe, 
Heart, Hippocampus, Kidney, Liver, Lung, Lymph node, Lymphoid tissue, Mammary 
gland/Breast, Muscle, Ovary, Pancreas, Parathyroid Gland, Parotid Salivary glands, Peripheral 
Blood, Pineal Gland, Pituitary Gland, Placenta, Prostate, Respiratory Bronchiole, Retina, Skin, 
Small Intestine, Spinal Cord, Stomach, Substantia Nigra, Synovium/Synovial membrane, Testis, 
Thalamus, Thyroid, Tonsils, Umbilical Vein, Uterus and Vein. This information was derived by 
determining the tissue sources of the sequences that were included in the invention including but 
not limited to SeqCalling sources, Public EST sources, Literature sources, and/or RACE sources. 

NOV7 also has homology to the amino acid sequences shown in the BLASTP data listed 
in Table 7C. 



Table 7C. BLAST results for NOV7 

Gene Index/Identifier                              Protein/Organism                                    Length (aa)    Identity (%)         Positives (%)        Expect 
gi|13272520|gb|AAK17190.1|AF332189_1 (AF332189)    pancreatitis-induced protein 49 [Mus musculus]      431            382/431 (88%)        397/431 (91%)        0.0 
gi|9790001|ref|NP_062807.1| (NM_019833)            hypothetical protein 1-82 [Mus musculus]            428            313/348 (89%)        322/348 (91%)        e-176 
gi|18027802|gb|AAL55862.1|AF318355_1 (AF318355)    unknown [Homo sapiens]                              266            255/256 (99%)        255/256 (99%)        e-136 
gi|12850997|dbj|BAB28914.1| (AK013580)             putative [Mus musculus]                             [illegible]    [illegible] (48%)    [illegible] (67%)    e-121 
gi|17433824|ref|XP_028387.2| (XM_028387)           hypothetical protein XP_028387 [Homo sapiens]       403            194/403 (48%)        275/403 (68%)        e-119 



The homology of these sequences is shown graphically in the ClustalW analysis shown in 
Table 7D. 



Table 7D. ClustalW Analysis of NOV7 

1) NOV7 (SEQ ID NO: 24) 

2) gi|13272520|gb|AAK17190.1|AF332189_1 (AF332189) pancreatitis-induced protein 49 
[Mus musculus] (SEQ ID NO: 67) 

3) gi|9790001|ref|NP_062807.1| (NM_019833) hypothetical protein 1-82 [Mus musculus] 
(SEQ ID NO: 68) 

4) gi|18027802|gb|AAL55862.1|AF318355_1 (AF318355) unknown [Homo sapiens] 
(SEQ ID NO: 69) 

5) gi|12850997|dbj|BAB28914.1| (AK013580) putative [Mus musculus] (SEQ ID 
NO: 70) 

6) gi|17433824|ref|XP_028387.2| (XM_028387) hypothetical protein XP_028387 [Homo 
sapiens] (SEQ ID NO: 71) 

10 20 30 40 50 

■I 



NOV7 COR101716725 MRRLRRLAHLVLFCPFSKRLQGRLPGLRVRCIFLA GV A LV H 



gi I 13272520 j 
gi (9790001) 
gi j 18027802) 
gi j 12850997 j 
gi j 17433824 j 



MRRLRRLVHLVLLCPFSKGLQGRLPGLRVKYVLLV GI V MV H 
m H 

MARSLCAGAWLRKPHYLQARLS YMRVKYLFFS W V HQ 

MKYLFFS W V II Q 



NOV7 COR101716725 

gi 
gi 
gi 



13272520) 
9790001 j 
18027802] 
12850997 j 
gi) 17433824 j 



91 



T T 
T T 



60 



•I- 



70 



80 



90 



S S R HV QW Q RK I S SV QD ELHM 
S S HV QW Q RK I S SV QD ELQK 
S S HV QW Q RK I S SV QD BLQK 
M 



100 



V 

s 
s 

V 



KD KKI K KT V D PA NS VTETLYFGK NK S 
KD KKI K KT V D PA NS VTETLYFGK TK N 



110 



120 



130 



140 



150 



NOV7 COR101716725 
gi|l3272520| 
gij 9790001) 
gij 18027802 [ 
gij 12850997 j 
gi|l7433824 | 



N M 
N M 



R D 
Q E 
Q E 
R D 
DNLPGW 
DNLPGW 



D 
N 
N 
D 



RS 
WP 
WP 
RS 



QM Q 
QM Q 



HLDFGTELE 
HLDFGTELE 



TVQ 
TVQ 



160 



170 



180 



190 



200 



NOV7 COR101716725 
gi 1 13272520 | 
gij 9790001 1 
gi|l8027802| 
gij 12850997 j 
gi j 17433824 j 



K K 
K K 



G 

VY LF 
VY LF 



QGN SE 
QGN SE 



G V 

D 

D 

G V 

NL TV 
NL TV 



N 
S 
S 

N 

GDR GQ 
GD GQ 



A 
A 



210 



220 



230 



240 



250 



NOV7 COR101716725 
gi|l3272520| 



A A 
V 
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L 

L A 



gi 1 9790001 1 I V L A 

gi|l8027802 | AAA L 

gi | 12850997 1 L MVI D TPK M F VM S EYT LY IS WVMEL 

gi [ 17433824 j L MVI D TPK M F VM S EYT LY IS WVIEL 

                         260       270       280       290       300 
                  ....|....|....|....|....|....|....|....|....|....| 
NOV7 COR101716725 PA QG L 
gi|13272520|      V H 
gi|9790001|       V H 
gi|18027802|      PA QG L 
gi|12850997|      FI GFR SMD L T S RK DV P N L D SA 
gi|17433824|      FI GFR SMD L T S RK DV P N L D SA 

                         310       320       330       340       350 
                  ....|....|....|....|....|....|....|....|....|....| 
NOV7 COR101716725 R H T T R AP 
gi|13272520|      Q S I R AP 
gi|9790001|       Q S I R AP 
gi|18027802|      R H T TT 
gi|12850997|      K L NEK L V MRKIV TNLKELIKD SDL V T TS 
gi|17433824|      K L NDK L V MRKIV TNLKELIKD SDL V T TS 

                         360       370       380       390       400 
                  ....|....|....|....|....|....|....|....|....|....| 
NOV7 COR101716725 RLMRQ KGDL V A RG  P ADLR  GT RT TT S 
gi|13272520|      RLMRQ KGDL V E R  P AGLY  G RT TT S 
gi|9790001|       RLMRQ KGDL V E R  P AGLY  G CAPAPQKV 
gi|18027802|      TAGPRVTGS- 
gi|12850997|      LSTMK TSEV A Q K  SEIR  E YS IA K 
gi|17433824|      QSTMK TSEV A Q K  SEIR  E YS IA K 

                         410       420       430       440       450 
                  ....|....|....|....|....|....|....|....|....|....| 
NOV7 COR101716725 GL S V AH V SH KK N KY 
gi|13272520|      GL S I AH V SH RE N NY 
gi|9790001|       DWPARLRLTIHWC AT RPYSGGRSPTPTTPRAAGSRHY SQVAPPHSLQ 
gi|18027802| 
gi|12850997|      VTNMME INN KK Y ND 
gi|17433824|      VT N M MS I UN KK Y ND 

                         460       470 
                  ....|....|....|....|.. 
NOV7 COR101716725 
gi|13272520| 
gi|9790001|       QLSRGARGP YQRWPTGPNP PNM 
gi|18027802| 
gi|12850997| 
gi|17433824| 



Many calcium-binding proteins belong to the same evolutionary family and share a type of 
calcium-binding domain known as the EF-hand. This type of domain consists of a twelve residue 
loop flanked on both sides by a twelve residue alpha-helical domain. In an EF-hand loop the 
calcium ion is coordinated in a pentagonal bipyramidal configuration. The six residues involved 
in the binding are in positions 1, 3, 5, 7, 9 and 12; these residues are denoted by X, Y, Z, -Y, -X 
and -Z. The invariant Glu or Asp at position 12 provides two oxygens for liganding Ca (bidentate 
ligand). 
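
The positional convention described above can be illustrated with a short sketch that labels the six coordinating positions of a twelve residue loop and tests for Glu or Asp at position 12. The function names are hypothetical, and the example loop (the first EF-hand loop commonly cited for calmodulin) is an assumption introduced for illustration only; it is not a sequence of this disclosure, and the check is far simpler than a full EF-hand profile search.

```python
# Loop positions involved in calcium binding and their conventional labels
LIGAND_POSITIONS = {1: "X", 3: "Y", 5: "Z", 7: "-Y", 9: "-X", 12: "-Z"}


def ef_hand_ligands(loop: str) -> dict[str, str]:
    """Map each coordination label to the residue found at that loop position."""
    if len(loop) != 12:
        raise ValueError("an EF-hand loop is expected to be twelve residues long")
    return {label: loop[pos - 1] for pos, label in LIGAND_POSITIONS.items()}


def has_bidentate_residue(loop: str) -> bool:
    """True if position 12 carries the Glu or Asp that acts as bidentate ligand."""
    return len(loop) == 12 and loop[11] in "DE"


# Illustrative loop only (assumed example, not taken from this disclosure)
example_loop = "DKDGDGTITTKE"
print(ef_hand_ligands(example_loop))
print(has_bidentate_residue(example_loop))
```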

The protein similarity information, expression pattern, and map location for the NOV7 
protein and nucleic acid disclosed herein suggest that it may have important structural and/or 
physiological functions characteristic of the EF-hand family. Therefore, the nucleic acids and 
proteins of the invention are useful in potential diagnostic and therapeutic applications and as a 
research tool. These include serving as a specific or selective nucleic acid or protein diagnostic 
and/or prognostic marker, wherein the presence or amount of the nucleic acid or the protein is to 
be assessed, as well as potential therapeutic applications such as the following: (i) a protein 
therapeutic, (ii) a small molecule drug target, (iii) an antibody target (therapeutic, diagnostic, drug 
targeting/cytotoxic antibody), (iv) a nucleic acid useful in gene therapy (gene delivery/gene 
ablation), (v) a composition promoting tissue regeneration in vitro and in vivo, and (vi) a biological 
defense weapon. 

The NOV7 nucleic acid and protein are useful in potential diagnostic and therapeutic 
applications implicated in various diseases and disorders described below and/or other 
pathologies. For example, the compositions of the present invention will have efficacy for 
treatment of patients suffering from: Cardio-vascular diseases, Cardiomyopathy, Atherosclerosis, 
Hypertension, Congenital heart defects, Aortic stenosis, Atrial septal defect (ASD), 
Atrioventricular (A-V) canal defect, Ductus arteriosus, Pulmonary stenosis, Subaortic stenosis, 
Ventricular septal defect (VSD), valve diseases, Tuberous sclerosis, Scleroderma, Obesity, 
Transplantation, Systemic lupus erythematosus, Autoimmune disease, Asthma, Emphysema, 
Scleroderma, allergy, Diabetes, Autoimmune disease, Renal artery stenosis, Interstitial nephritis, 
Glomerulonephritis, Polycystic kidney disease, Systemic lupus erythematosus, Renal tubular 
acidosis, IgA nephropathy, Hypercalcemia, Lesch-Nyhan syndrome and other diseases, disorders 
and conditions of the like. 

NOV8 

NOV8 includes two GPCR-like proteins. They have been designated NOV8a and 
NOV8b. 

NOV8a 

A disclosed NOV8a nucleic acid (designated as CuraGen Acc. No. CG56663-01) encodes 
a novel GPCR-like protein and includes the 1062 nucleotide sequence (SEQ ID NO:25) shown in 
Table 8A. An open reading frame for the mature protein was identified beginning with an ATG 
codon at nucleotides 6-8 and ending with a TAA codon at nucleotides 948-950. Putative 
untranslated regions are underlined in Table 8A, and the start and stop codons are in bold letters. 




Table 8A. NOV8a Nucleotide Sequence (SEQ ID NO:25) 



TAGAGATGG ATGGAACCAATGGCAGCACCCAAACCCATTTCATCCTACTGGGATTCTCTGACCGAC 

CCCATCTGGAGAGGATCCTCTTTGTGGTCATCCTGATCGCGTACCTCCTGACCCTCGTAGGCAACAC 

CACCATCATCCTGGTGTCCCGGCTGGACCCCCACCTCCACACCCCCATGTACTTCTTCCTCGCCCACC 

TTTCCTTCCTGGACCTCAGTTTCACCACCAGCTCCATCCCCCAGCTGCTCTACAACCTTAATGGATGT 

GACAAGACCATCAGCTACATGGGCTGTGCCATCCAGCTCTTCCTGTTCCTGGGTCTGGGTGGTGTGG 

AGTGCCTGCTTCTGGCTGTCATGGCCTATGACCGGTGTGTGGCTATCTGCAAGCCCCTGCACTACAT 

GGTGATCATGAACCCCAGGCTCTGCCGGGGCTTGGTGTCAGTGACCTGGGGCTGTGGGGTGGCCAA 

CTCCTTGGCCATGTCTCCTGTGACCCTGCGCTTACCCCGCTGTGGGCACCACGAGGTGGACCACTTC 

CTGCGTGAGATGCCCGCCCTGATCCGGATGGCCTGCGTCAGCACTGTGGCCATCGAAGGCACCGTC 

TTTGTCCTGAAAAAAGGTGTTGTGCTGTCCCCCTTGGTGTTTATCCTGCTCTCTTACAGCTACATTGT 

GAGGGCTGTGTTACAAATTCGGTCAGCATCAGGAAGGCAGAAGGCCTTCGGCACCTGCGGCTCCCA 

TCTCACTGTGGTCTCCCTTTTCTATGGAAACATCATCTACATGTACATGCAGCCAGGAGCCAGTTCTT 

CCCAGGACCAGGGCATGTTCCTCATGCTCTTCTACAACATTGTCACCCCCCTCCTCAATCCTCTCATC 

TACACCCTCAGAAACAGAGAGGTGAAGGGGGCACTGGGAAGGTTGCTTCTGGGGAAGAGAGAGCT 

AGGAAAGGAGTAA AGGCATCTCCACCTGACTTCACTTCCATCCAGGGCCACTGGCAGCATCTGGAA 

CGGCTGAATTCCAGCTGATATTAGCCCACGACTCCCAACTTGCCTTTTTCTGGACTTTT 



The NOV8a polypeptide (SEQ ID NO:26) is 314 amino acid residues in length and is 
presented using the one-letter amino acid code in Table 8B. 



Table 8B. Encoded NOV8a Protein Sequence (SEQ ID NO:26) 



MDGTNGSTQTHFILLGFSDRPHLERILFVVILIAYLLTLVGNTTIILVSRLDPHLHTPMYFFLAHLSFLD 
LSFTTSSIPQLLYNLNGCDKTISYMGCAIQLFLFLGLGGVECLLLAVMAYDRCVAICKPLHYMVIMNP 
RLCRGLVSVTWGCGVANSLAMSPVTLRLPRCGHHEVDHFLREMPALIRMACVSTVAIEGTVFVLKK 
GVVLSPLVFILLSYSYIVRAVLQIRSASGRQKAFGTCGSHLTVVSLFYGNIIYMYMQPGASSSQDQGM 
FLMLFYNIVTPLLNPLIYTLRNREVKGALGRLLLGKRELGKE 



NOV8b 

A disclosed NOV8b nucleic acid (designated as CuraGen Acc. No. CG56663-02), which 
is a variant of NOV8a, includes the 1062 nucleotide sequence (SEQ ID NO:27) shown in Table 
8C. An open reading frame for the mature protein was identified beginning with an ATG codon 
at nucleotides 6-8 and ending with a TAA codon at nucleotides 948-950. The start and stop 
codons of the open reading frame are highlighted in bold type. Putative untranslated regions are 
underlined and found upstream from the initiation codon and downstream from the termination 
codon. 
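
The reading frame set by the ATG codon at nucleotides 6-8 can be illustrated by translating the first line of the Table 8C sequence with the standard genetic code. The sketch below is self-contained; the names `CODON_TABLE` and `translate` are hypothetical and introduced for illustration, and the code is not the software used to identify the open reading frame.

```python
# Standard genetic code, built in TCAG order ('*' marks a stop codon)
_BASES = "TCAG"
_AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
          "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}

# First 62 nucleotides of the NOV8b sequence as listed in Table 8C
NOV8B_5PRIME = "TAGAGATGGATGGAACCAATGGCAGCACCCAAACCCATTTCATCCTACTGGGATTCTCTGAC"


def translate(dna: str, start: int) -> str:
    """Translate from a 1-based start position until a stop codon or the end."""
    residues = []
    for pos in range(start - 1, len(dna) - 2, 3):
        amino = CODON_TABLE[dna[pos:pos + 3]]
        if amino == "*":
            break
        residues.append(amino)
    return "".join(residues)


print(translate(NOV8B_5PRIME, 6))
```

Translation from nucleotide 6 yields the amino-terminal residues MDGTNGSTQTHFILLGFSD, in agreement with the beginning of the encoded NOV8 protein sequences.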
Table 8C. NOV8b Nucleotide Sequence (SEQ ID NO:27) 



TAGAGATGGATGGAACCAATGGCAGCACCCAAACCCATTTCATCCTACTGGGATTCTCTGAC 

CGACCCCATCTGGAGAGGATCCTCTTTGTGGTCATCCTGATCGCGTACCTCCTGACCCTCGTA 

GGCAACACCACCATCATCCTGGTGTCCCGGCTGGACCCCCACCTCCACACCCCCATGTACTT 

CTTCCTCGCCCACCTTTCCTTCCTGGACCTCAGTTTCACCACCAGCTCCATCCCCCAGCTGCTC 

TACAACCTTAATGGATGTGACAAGACCATCAGCTACATGGGCTGTGCCATCCAGCTCTTCCT 

GTTCCTGGGTCTGGGTGGTGTGGAGTGCCTGCTTCTGGCTGTCATGGCCTATGACCGGTGTGT 

GGCTATCTGCAAGCCCCTGCACTACATGGTGATCATGAACCCCAGGCTCTGCCGGGGCTTGG 

TGTCAGTGACCTGGGGCTGTGGGGTGGCCAACTCCTTGGCCATGTCTCCTGTGACCCTGCGCT 

TACCCCGCTGTGGGCACCACGAGGTGGACCACTTCCTGCGTGAGATGCCCGCCCTGATCCGG 

ATGGCCTGCGTCAGCACTGTGGCCATCGACGGCACCGTCTTTGTCCTGGCGGTGGGTGTTGT 

GCTGTCCCCCTTGGTGTTTATCCTGCTCTCTTACAGCTACATTGTGAGGGCTGTGTTACAAAT 

TCGGTCAGCATCAGGAAGGCAGAAGGCCTTCGGCACCTGCGGCTCCCATCTCACTGTGGTCT 

CCCTTTTCTATGGAAACATCATCTACATGTACATGCAGCCAGGAGCCAGTTCTTCCCAGGAC 

CAGGGCATGTTCCTCATGCTCTTCTACAACATTGTCACCCCCCTCCTCAATCCTCTCATCTAC 

ACCCTCAGAAACAGAGAGGTGAAGGGGGCACTGGGAAGGTTGCTTTTGGGGAAGAGAGAGC 

TAGGAAAGGAGTAA AGGCATCTCCACCTGACTTCACTTCCATCCAGGGCCACTGGCAGCATC 

TGGAACGGCTGAATTCCAGCTGATATTAGCCCACGACTCCCAACTTGCCTTTTTCTGGACTTT 

T 



A NOV8b polypeptide (SEQ ID NO:28) is 314 amino acid residues and is presented using 
the one letter code in Table 8D. 



Table 8D. Encoded NOV8b Protein Sequence (SEQ ID NO:28) 

MDGTNGSTQTHFILLGFSDRPHLERILFVVILIAYLLTLVGNTTIILVSRLDPHLHTPMYFFLAHLSFLDLSF 
TTSSIPQLLYNLNGCDKTISYMGCAIQLFLFLGLGGVECLLLAVMAYDRCVAICKPLHYMVIMNPRLCR 
GLVSVTWGCGVANSLAMSPVTLRLPRCGHHEVDHFLREMPALIRMACVSTVAIDGTVFVLAVGVVLSP 
LVFILLSYSYIVRAVLQIRSASGRQKAFGTCGSHLTVVSLFYGNIIYMYMQPGASSSQDQGMFLMLFYNI 
VTPLLNPLIYTLRNREVKGALGRLLLGKRELGKE 

The nucleic acid sequence of NOV8 has 600 of 710 bases (84%) identical to a 
gb:GENBANK-ID:AX008326|acc:AX008326.1 mRNA from Marmota marmota (Sequence 24 
from Patent WO9967282). 

A NOV8 amino acid sequence has 314 of 314 amino acids (100%) identical to, and 314 of 
314 amino acids (100%) similar to, a gi|17445344|ref|XP_060558.1| XM_060558 protein from 
Homo sapiens (Human) (similar to OLFACTORY RECEPTOR) (E = e-164). 

NOV8 is expressed in at least the following tissues: Apical microvilli of the retinal 
pigment epithelium, arterial (aortic), basal forebrain, brain, Burkitt lymphoma cell lines, corpus 
callosum, cardiac (atria and ventricle), caudate nucleus, CNS and peripheral tissue, cerebellum, 
cerebral cortex, colon, cortical neurogenic cells, endothelial (coronary artery and umbilical vein) 
cells, palate epithelia, eye, neonatal eye, frontal cortex, fetal hematopoietic cells, heart, 
hippocampus, hypothalamus, leukocytes, liver, fetal liver, lung, lung lymphoma cell lines, fetal 
lymphoid tissue, adult lymphoid tissue, Those that express MHC II and III nervous, medulla, 
subthalamic nucleus, ovary, pancreas, pituitary, placenta, pons, prostate, putamen, serum, skeletal 
muscle, small intestine, smooth muscle (coronary artery in aortic) spinal cord, spleen, stomach, 
taste receptor cells of the tongue, testis, thalamus, and thymus tissue. This information was 
derived by determining the tissue sources of the sequences that were included in the invention 
including but not limited to SeqCalling sources, Public EST sources, Literature sources, and/or 
RACE sources. 

NOV8a and NOV8b are very closely homologous as is shown in the amino acid alignment 
in Table 8E. 

Table 8E. Amino Acid Alignment of NOV8a and NOV8b 

                         10        20        30        40        50 
                 ....|....|....|....|....|....|....|....|....|....| 
NOV8a CG56663-01 
NOV8b CG56663-02 

                         60        70        80        90       100 
NOV8a CG56663-01 
NOV8b CG56663-02 

                        110       120       130       140       150 
NOV8a CG56663-01 
NOV8b CG56663-02 

                        160       170       180       190       200 
NOV8a CG56663-01                                              E 
NOV8b CG56663-02                                              D 

                        210       220       230       240       250 
NOV8a CG56663-01   KK 
NOV8b CG56663-02   AV 

                        260       270       280       290       300 
NOV8a CG56663-01 
NOV8b CG56663-02 

                        310 
NOV8a CG56663-01 
NOV8b CG56663-02 

Homologies to any of the above NOV8 proteins will be shared by the other NOV8 
proteins insofar as they are homologous to each other as shown above. Any reference to NOV8 is 
assumed to refer to both of the NOV8 proteins in general, unless otherwise noted. 
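
The residues at which NOV8a and NOV8b differ can be located by a position-by-position comparison of the sequences of Tables 8B and 8D. The sketch below compares only the segment spanning residues 189 to 208 of each polypeptide, as read from those tables; the function name `substitutions` and the segment boundaries are choices made here for illustration.

```python
# Residues 189-208 of NOV8a (Table 8B) and NOV8b (Table 8D)
NOV8A_SEGMENT = "CVSTVAIEGTVFVLKKGVVL"
NOV8B_SEGMENT = "CVSTVAIDGTVFVLAVGVVL"
SEGMENT_START = 189  # 1-based position of the first residue of each segment


def substitutions(a: str, b: str, start: int = 1) -> list[tuple[int, str, str]]:
    """List (position, residue in a, residue in b) wherever two equal-length
    sequences differ; positions are 1-based and offset by `start`."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length for direct comparison")
    return [(start + i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


for position, res_a, res_b in substitutions(NOV8A_SEGMENT, NOV8B_SEGMENT, SEGMENT_START):
    print(f"{res_a}{position}{res_b}")
```

Within this segment the comparison reports a Glu to Asp change at position 196 and Lys-Lys to Ala-Val at positions 203-204, matching the E/D and KK/AV entries of Table 8E.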
The SignalP, Psort and/or Hydropathy results predict that a NOV8 has a signal peptide and 
is likely to be localized to the plasma membrane with a certainty of 0.6000. In alternative 
embodiments, a NOV8 polypeptide is located to the Golgi body with a certainty of 0.4000, the 
endoplasmic reticulum (membrane) with a certainty of 0.3000, or the microbody (peroxisome) 
with a certainty of 0.3000. The SignalP predicts a likely cleavage site for a NOV8 peptide 
between amino acid positions 41 and 42, i.e., at the dash in the sequence LVG-NT. 

NOV8a also has homology to the amino acid sequences shown in the BLASTP data listed 
in Table 8F. 





Table 8F. BLAST results for NOV8a 

Gene Index/Identifier                              Protein/Organism                                                          Length (aa)   Identity (%)     Positives (%)    Expect 
gi|17445344|ref|XP_060558.1| (XM_060558)           similar to olfactory receptor (H. sapiens) [Homo sapiens]                 314           314/314 (100%)   314/314 (100%)   e-164 
gi|5901478|gb|AAD55304.1|AF044033_1 (AF044033)     olfactory receptor [Marmota marmota]                                      237           194/237 (81%)    215/237 (89%)    2e-99 
gi|13624329|ref|NP_112165.1| (NM_030903)           olfactory receptor, family 2, subfamily W, member 1 [Homo sapiens]        320           184/305 (60%)    236/305 (77%)    1e-94 
gi|12054431|emb|CAC20523.1| (AJ302603)             olfactory receptor [Homo sapiens]                                         320           184/305 (60%)    236/305 (77%)    1e-94 
gi|12054429|emb|CAC20522.1| (AJ302602)             olfactory receptor [Homo sapiens]                                         320           184/305 (60%)    236/305 (77%)    2e-94 
The homology of these sequences is shown graphically in the ClustalW analysis shown in 
Table 8G. 

Table 8G. ClustalW Analysis for NOV8a 

1) NOV8a (SEQ ID NO:26) 

2) NOV8b (SEQ ID NO: 28) 

3) gi|17445344|ref|XP_060558.1| (XM_060558) similar to olfactory receptor (H. 
sapiens) [Homo sapiens] (SEQ ID NO: 72) 

4) gi|5901478|gb|AAD55304.1|AF044033_1 (AF044033) olfactory receptor [Marmota 
marmota] (SEQ ID NO: 73) 

5) gi|13624329|ref|NP_112165.1| (NM_030903) olfactory receptor, family 2, subfamily 
W, member 1 [Homo sapiens] (SEQ ID NO: 74) 

6) gi|12054431|emb|CAC20523.1| (AJ302603) olfactory receptor [Homo sapiens] (SEQ 
ID NO:75) 

7) gi|12054429|emb|CAC20522.1| (AJ302602) olfactory receptor [Homo sapiens] (SEQ ID 
NO:76) 



10 



20 
• \ • • 



30 



40 



I 



NOV8a Cura 559 CG56663-01 

NOV8b Cura- 5593 CG56663-02 

gi|l7445344| 

gi|5901478| 

gi jl3624329[ 

gi [12054431 | 

gi|l2054429| 



NOV8a Cura 559 CG56663-01 

NOV8b Cura~559B CG56663-02 

gijl7445344| 

gi|5901478| 

gijl3624329| 

gi|l205443l| 

gij 12054429) 



GT G TQTH 
GT G TQTH 
GT G TQTH 



DR HL R 
DR HL R 
DR HL R 



FV IL A 
FV IL A 
FV IL A 



QS Y SLHG 
QS Y SLHG 
QS Y SLHG 

60 



NH KM M 
NH KM M 
NH KM M 

70 



SG VA P 
SG VA F 
SG VA F 

80 



A 
A 
A 



90 
.|....| 



110 



120 



130 



140 



50 
...| 

V R 

V R 

V R 



A L 
A L 
A L 

100 
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PH 


AH 
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PH 


AH 
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L Y 
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L GN 
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L H 
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SQ 


RN 
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M V 
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I 


SG 


RN 
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V V 


w 


P 


V 


I 


SQ 


RN 


c 


I 


M V 


w 


P 


V 


I 



                                   110       120       130       140       150 
NOV8a Cura 559 CG56663-01   FLFLG G A CV M I R RGLVSVT G 
NOV8b Cura-559B CG56663-02  FLFLG G A CV M I R RGLVSVT G 
gi|17445344|                FLFLG G A CV M I R RGLVSVT G 
gi|5901478|                 FLFLG G A FV V T I SSR LGLVSVA G 
gi|13624329|                YVYMW S S FT F V H LKMIIMI S 
gi|12054431|                YVYMW S S FT F V H LKMIIMI S 
gi|12054429|                YVYMW S S FT F V H LKMIIMI S 

                                   160       170       180       190       200 
NOV8a Cura 559 CG56663-01   CGV LAMSPV R R HHEV R M IRM S VAI GT 
NOV8b Cura-559B CG56663-02  CGV LAMSPV R R HHEV R M IRM S VAIDGT 
gi|17445344|                CGV LAMSPV R R HHEV R M IRM S VAI GT 
gi|5901478|                 CGM LVMSPV Q R HNKV C M IRM N VAI GT 
gi|13624329|                ISL WLCTL N T NNIL C L VKI D TTV MS 
gi|12054431|                ISL WLCTL N T NNIL C L VKI D TTV MS 
gi|12054429|                ISL WLCTL N T NNIL C L VKI D TTV MS 

                                   210       220       230       240       250 
NOV8a Cura 559 CG56663-01   V KKGV S VF L S VR QIR ASGRQ FG L 
NOV8b Cura-559B CG56663-02  V AVGV S VF L S VR QIR ASGRQ FG L 
gi|17445344|                V KKGV S VF L S VR QIR ASGRQ FG L 
gi|5901478|                 V AVGI S VF V GH VR FRIQ SSGRHRIFN L 
gi|13624329|                A GIII T IL I G AK RTK KASQR MN M 
gi|12054431|                A GIII T IL I G AK RTK KASQR MN M 
gi|12054429|                A GIII T IL I G AK RTK KASQR MN M 

                                   260       270       280       290       300 
NOV8a Cura 559 CG56663-01   N M ASS Q M M NIV L REV G 
NOV8b Cura-559B CG56663-02  N M ASS Q M M NIV L REV G 
gi|17445344|                N M ASS Q M M NIV L REV G 
gi|5901478|                 N M SRS Q K T NIV L F S 
gi|13624329|                T L NRA K K T TVI S KDM D 
gi|12054431|                T L NRA K K T TVI S KDM D 
gi|12054429|                T L NRA K K T TVI S KNM D 

                                   310       320 
NOV8a Cura 559 CG56663-01   GR LLGKRELG E- 
NOV8b Cura-559B CG56663-02  GR LLGKRELG E- 
gi|17445344|                GR LLGKRELG E- 
gi|5901478| 
gi|13624329|                KK MRFHHKST IKRNCKS 
gi|12054431|                KK MRFHHKST IKRNCKS 
gi|12054429|                KK MRFHHKST IKRNCKS 



Table 8H lists the domain description from DOMAIN analysis results against NOV8.
This indicates that the NOV8 sequence has properties similar to those of other proteins known to
contain these domains.







Table 8H. Domain Analysis of NOV8

gnl|Pfam|pfam00001, 7tm_1, 7 transmembrane receptor (rhodopsin family).
(SEQ ID NO: 77)
CD-Length = 254 residues, 100.0% aligned
Score = [illegible] bits (235), Expect = 5e-21

NOV8:   41 GNTTIILVSRLDPHLHTPMYFFLAHLSFLDLSFTTSSIPQLLYNLNGCDKTISYMGCAIQ 100
Sbjct:   1 GNLLVILVILRTKKLRTPTNIFLLNLAVADLLFLLTLPPWALYYLVGGDWVFGDALCKLV 60

NOV8:  101 LFLFLGLGGVECLLLAVMAYDRCVAICKPLHYMVIMNPRLCRGLVSVTWGCGVANSLAMS 160
Sbjct:  61 GALFVVNGYASILLLTAISIDRYLAIVHPLRYRRIRTPRRAKVLILLVWVLALLLSLP-- 118

NOV8:  161 PVTIiRLPRCGHHEVDHFLREMPALIRMACVSTVAIEGTVFVLKKGWLSPLVFILLSYSY 220
Sbjct: 119 PLLFSWLRTVEEGNTTVCLIDFPEESVKRSYVLLSTLVGFVLPLLVILVCYTR 171

NOV8:  221 IVRAVLQIRSASGRQKAFGTCGSHLTVVSLFYGNIIYMYMQPGASSS 267
Sbjct: 172 ILRTLRKRARSQRSLKRRSSSERKAAKMLLVVVVVFVLCWLPYHIVLLLDSLCLLSIWRV 231

NOV8:  268 QDQGMFLMLFYNIVTPLLNPLIY 290
Sbjct: 232 LPTALLITLWLAYVNSCLNPIIY 254

[Match lines and gap characters of the alignment are illegible in the source.]

G-protein coupled receptors (GPCRs) have been identified as an extremely large family of
receptors in a number of species. These receptors share a seven
transmembrane domain structure with many neurotransmitter and hormone receptors, and are
likely to underlie the recognition and G-protein-mediated transduction of various signals.
Previously, GPCR genes cloned in different species were from random locations in the respective
genomes. The human GPCR genes are intronless and belong to four different gene subfamilies,
displaying great sequence variability. These genes are dominantly expressed in olfactory
epithelium.

Olfactory receptors (ORs) have been identified as an extremely large subfamily of G
protein-coupled receptors in a number of species. These receptors share a seven transmembrane domain
structure with many neurotransmitter and hormone receptors, and are likely to underlie the
recognition and G-protein-mediated transduction of odorant signals. Previously, OR genes cloned
in different species were from random locations in the respective genomes. The human OR genes
are intronless and belong to four different gene subfamilies, displaying great sequence variability.
These genes are dominantly expressed in olfactory epithelium.

The protein similarity information, expression pattern, and map location for the NOV8
proteins and nucleic acids disclosed herein suggest that they may have important structural and/or
physiological functions characteristic of the GPCR family. Therefore, the nucleic acids and
proteins of the invention are useful in potential diagnostic and therapeutic applications and as a
research tool. These include serving as a specific or selective nucleic acid or protein diagnostic
and/or prognostic marker, wherein the presence or amount of the nucleic acid or the protein is to
be assessed, as well as potential therapeutic applications such as the following: (i) a protein
therapeutic, (ii) a small molecule drug target, (iii) an antibody target (therapeutic, diagnostic, drug
targeting/cytotoxic antibody), (iv) a nucleic acid useful in gene therapy (gene delivery/gene
ablation), (v) a composition promoting tissue regeneration in vitro and in vivo, and (vi) a biological
defense weapon.

The NOV8 nucleic acid and protein are useful in potential diagnostic and therapeutic
applications implicated in various diseases and disorders described below and/or other
pathologies. For example, the compositions of the present invention will have efficacy for
treatment of patients suffering from: developmental diseases; MHC II and III diseases (immune
diseases); taste and scent detectability disorders; Burkitt's lymphoma; corticoneurogenic disease;
signal transduction pathway disorders; retinal diseases, including those involving
photoreception; cell growth rate disorders; cell shape disorders; feeding disorders; control of
feeding; potential obesity due to over-eating; potential disorders due to starvation (lack of appetite);
noninsulin-dependent diabetes mellitus (NIDDM1); bacterial, fungal, protozoal and viral
infections (particularly infections caused by HIV-1 or HIV-2); pain; cancer (including but not
limited to neoplasm, adenocarcinoma, lymphoma, prostate cancer and uterus cancer); anorexia;
bulimia; asthma; Parkinson's disease; acute heart failure; hypotension; hypertension; urinary
retention; osteoporosis; Crohn's disease; multiple sclerosis; Albright hereditary
osteodystrophy; angina pectoris; myocardial infarction; ulcers; allergies; benign
prostatic hypertrophy; psychotic and neurological disorders, including anxiety, schizophrenia,
manic depression, delirium, dementia and severe mental retardation; dentatorubro-pallidoluysian
atrophy (DRPLA); hypophosphatemic rickets, autosomal dominant (2); acrocallosal syndrome; and
dyskinesias, such as Huntington's disease or Gilles de la Tourette syndrome; and/or other
pathologies and disorders of the like. The polypeptides can be used as immunogens to produce
antibodies specific for the invention, and as vaccines. They can also be used to screen for
potential agonist and antagonist compounds. For example, a cDNA encoding the OR-like protein
may be useful in gene therapy, and the OR-like protein may be useful when administered to a
subject in need thereof. By way of nonlimiting example, the compositions of the present
invention will have efficacy for treatment of patients suffering from bacterial, fungal, protozoal
and viral infections (particularly infections caused by HIV-1 or HIV-2); pain; cancer (including
but not limited to neoplasm, adenocarcinoma, lymphoma, prostate cancer and uterus cancer);
anorexia; bulimia; asthma; Parkinson's disease; acute heart failure; hypotension; hypertension;
urinary retention; osteoporosis; Crohn's disease; multiple sclerosis; Albright
hereditary osteodystrophy; angina pectoris; myocardial infarction; ulcers; allergies;
benign prostatic hypertrophy; psychotic and neurological disorders, including anxiety,
schizophrenia, manic depression, delirium, dementia and severe mental retardation; and dyskinesias,
such as Huntington's disease or Gilles de la Tourette syndrome; and/or other pathologies and
disorders. The novel nucleic acid encoding OR-like protein, and the OR-like protein of the
invention, or fragments thereof, may further be useful in diagnostic applications, wherein the
presence or amount of the nucleic acid or the protein is to be assessed. These materials are
further useful in the generation of antibodies that bind immunospecifically to the novel substances
of the invention for use in therapeutic or diagnostic methods.

NOV9 

A disclosed NOV9 nucleic acid (designated CuraGen Acc. No. CG56787-01) encodes
a novel dual specificity phosphatase and includes the 624 nucleotide sequence (SEQ ID NO:29)
shown in Table 9A. An open reading frame for the mature protein was identified beginning with an
ATG initiation codon at nucleotides 43-45 and ending with a TGA codon at nucleotides 607-609.
Putative untranslated regions downstream from the termination codon are underlined in Table 9A,
and the stop codon is in bold letters.



Table 9A. NOV9 Nucleotide Sequence (SEQ ID NO:29)



CTTTGAGCTTCTCTGACTGCTGACCACTGACCCACCGACTTGATGACAGCACCCTCGTGTGCCTTCC 

CAGTTCAAATCCGGCAGCCCTCAGTCAGCGGCCTCTCGCAGATAACCAAAAGCCTGTATATCAGCA 

ATGGTGTGGCCGCCAACAACAAGCTCATGCTGTCTAGCAACCAGATCACCATGGTCATCAATGTCTC 

AGTGGAGGTAGTGAACACCTTGTATGAGGATATCCAGTACATGCAGGTACCTGTGGCTGACTCCCC 

TAACTCACGTCTCTGTGACTTCTTTGACCCTATTGCTGACCATATCCACAGCGTGGAGATGAAGCAG 

GGCCGTACTTTGCTGCACTGTGCTGCTGGTGTGAGCCGCTCAGCTGCCCTGTGCCTCGCCTACCTCA 

TGAAGTACCACGCCATGTCCCTGCTGGACGCCCACACGTGGACCAAGTCATGCCGGCCCATCATCC 

GACCCAACAGCGGCTTTTGGGAGCAGCTCATCCACTATGAGTTCCAATTGTTTGGCAAGAACACTGT 

GCACATGGTCAGTTCCCCAGTGGGAATGATCCCTGACATCTATGAGAAGGAAGTCCGTTTGATGATT 

CCACTGTGAGCCATCCCACGAGCC 



The nucleic acid sequence of NOV9 maps to chromosome 22 and has 363 of 563 bases 
(64%) identical to a gb:GENBANK-ID:AF120032|acc:AF120032.1 mRNA from Homo sapiens 
(Homo sapiens MAP kinase phosphatase 6 (MKP6) mRNA, complete cds). 

The NOV9 polypeptide (SEQ ID NO:30) is 188 amino acid residues in length and is 
presented using the one-letter amino acid code in Table 9B. The SignalP, Psort and/or 
Hydropathy results predict that NOV9 has a signal peptide and is likely to be localized to the 
cytoplasm with a certainty of 0.4500. In alternative embodiments, a NOV9 polypeptide is localized
to the microbody (peroxisome) with a certainty of 0.3000, the lysosome (lumen) with a certainty
of 0.1955, or the mitochondrial matrix space with a certainty of 0.1000. 



Table 9B. Encoded NOV9 Protein Sequence (SEQ ID NO:30) 

MTAPSCAFPVQIRQPSVSGLSQITKSLYISNGVAANNKLMLSSNQITMVINVSVEVVNTLYEDIQYMQ 
VPVADSPNSRLCDFFDPIADHIHSVEMKQGRTLLHCAAGVSRSAALCLAYLMKYHAMSLLDAHTWT
KSCRPIIRPNSGFWEQLIHYEFQLFGKNTVHMVSSPVGMIPDIYEKEVRLMIPL 
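
As a cross-check, the nucleotide sequence of Table 9A can be translated and compared with the
protein of Table 9B. The following Python sketch is illustrative only: the helper names
(`translate`, `find_orf`) are hypothetical, the standard genetic code is assumed, and the first
ATG in the sequence is assumed to be the initiation codon.

```python
# Illustrative sketch only: helper names are hypothetical, the standard
# genetic code is assumed, and the first ATG is assumed to start the ORF.
# The sequence below is transcribed from Table 9A (SEQ ID NO:29).
NOV9_NT = (
    "CTTTGAGCTTCTCTGACTGCTGACCACTGACCCACCGACTTGATGACAGCACCCTCGTGTGCCTTCC"
    "CAGTTCAAATCCGGCAGCCCTCAGTCAGCGGCCTCTCGCAGATAACCAAAAGCCTGTATATCAGCA"
    "ATGGTGTGGCCGCCAACAACAAGCTCATGCTGTCTAGCAACCAGATCACCATGGTCATCAATGTCTC"
    "AGTGGAGGTAGTGAACACCTTGTATGAGGATATCCAGTACATGCAGGTACCTGTGGCTGACTCCCC"
    "TAACTCACGTCTCTGTGACTTCTTTGACCCTATTGCTGACCATATCCACAGCGTGGAGATGAAGCAG"
    "GGCCGTACTTTGCTGCACTGTGCTGCTGGTGTGAGCCGCTCAGCTGCCCTGTGCCTCGCCTACCTCA"
    "TGAAGTACCACGCCATGTCCCTGCTGGACGCCCACACGTGGACCAAGTCATGCCGGCCCATCATCC"
    "GACCCAACAGCGGCTTTTGGGAGCAGCTCATCCACTATGAGTTCCAATTGTTTGGCAAGAACACTGT"
    "GCACATGGTCAGTTCCCCAGTGGGAATGATCCCTGACATCTATGAGAAGGAAGTCCGTTTGATGATT"
    "CCACTGTGAGCCATCCCACGAGCC"
)

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq, start):
    """Translate codons from `start` (0-based) until the first stop codon.

    Returns (protein, stop_index); stop_index is None if no stop is found.
    """
    residues = []
    for i in range(start, len(seq) - 2, 3):
        amino_acid = CODON_TABLE[seq[i:i + 3]]
        if amino_acid == "*":
            return "".join(residues), i
        residues.append(amino_acid)
    return "".join(residues), None


def find_orf(seq):
    """Return (start_index, stop_index, protein) for the first ATG in `seq`."""
    start = seq.find("ATG")
    if start < 0:
        raise ValueError("no ATG found")
    protein, stop = translate(seq, start)
    return start, stop, protein


if __name__ == "__main__":
    start, stop, protein = find_orf(NOV9_NT)
    # Report 1-based nucleotide positions, as used in the text.
    print(f"ATG at nucleotides {start + 1}-{start + 3}")
    print(f"stop codon {NOV9_NT[stop:stop + 3]} at nucleotides {stop + 1}-{stop + 3}")
    print(f"{len(protein)} residues: {protein}")
```

Run against the sequence as transcribed here, the sketch places the ATG at nucleotides 43-45 and
the in-frame TGA at nucleotides 607-609, and yields a 188-residue translation.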

The NOV9 amino acid sequence has 187 of 188 amino acid residues (99%) identical to,
and 187 of 188 amino acid residues (99%) similar to, the 188 amino acid residue
gi|17485142|ref|XP_038481.2| (XM_038481) protein from Homo sapiens (Human)
(HYPOTHETICAL PROTEIN XP_038481) (E = e-102).

NOV9 is expressed in at least the following tissues: Brain, Brown adipose, Cartilage,
Colon, Dermis, Epidermis, Hair Follicles, Hippocampus, Hypothalamus, Kidney, Lung, Lymph
node, Lymphoid tissue, Ovary, Oviduct/Uterine Tube/Fallopian tube, Parotid Salivary glands,
Peripheral Blood, Pituitary Gland, Prostate, Right Cerebellum, Skin, Substantia Nigra, Testis,
Thyroid, Tonsils, Umbilical Vein, Uterus, Vulva, Whole Organism. Expression information was
derived from the tissue sources of the sequences that were included in the derivation of the
sequence of NOV9. The sequence is predicted to be expressed in the following tissues because of
the expression pattern of a closely related homolog in Homo sapiens
(gb:GENBANK-ID:AF120032|acc:AF120032.1, Homo sapiens MAP kinase phosphatase 6 (MKP6) mRNA,
complete cds): breast and ovarian tissue, pancreas, brain, liver, kidney, spleen, testis,
ovary, and peripheral blood leukocytes.

NOV9 has homology to the amino acid sequences shown in the BLASTP data listed in Table 9C.



Table 9C. BLAST results for NOV9

Gene Index/Identifier               Protein/Organism                        Length  Identity       Positives      Expect
                                                                            (aa)    (%)            (%)

gi|17485142|ref|XP_038481.2|        hypothetical protein XP_038481          188     187/188 (99%)  187/188 (99%)  e-102
(XM_038481)                         [Homo sapiens]

gi|18043293|gb|AAH20036.1|AAH20036  Unknown (protein for MGC:28218)         188     156/188 (82%)  171/188 (89%)  4e-86
(BC020036)                          [Mus musculus]

gi|13278657|gb|AAH04110.1|AAH04110  Unknown (protein for IMAGE:3689593)     151     148/148 (100%) 148/148 (100%) 2e-81
(BC004110)                          [Homo sapiens]

gi|12840422|dbj|BAB24847.1|         putative [Mus musculus]                 189     137/186 (73%)  158/186 (84%)  1e-76
(AK007061)

gi|10334445|emb|CAC10195.1|         bA386N14.1 (novel protein similar to a  190     131/190 (68%)  164/190 (85%)  6e-72
(AL133545)                          dual specificity phosphatase)
                                    [Homo sapiens]

The homology of these sequences is shown graphically in the ClustalW analysis shown in
Table 9D.



Table 9D. ClustalW Analysis of NOV9 



1) NOV9 (SEQ ID NO: 30)

2) gi|17485142|ref|XP_038481.2| (XM_038481) hypothetical protein XP_038481 [Homo
sapiens] (SEQ ID NO: 78)

3) gi|18043293|gb|AAH20036.1|AAH20036 (BC020036) Unknown (protein for MGC:28218)
[Mus musculus] (SEQ ID NO: 79)

4) gi|13278657|gb|AAH04110.1|AAH04110 (BC004110) Unknown (protein for
IMAGE:3689593) [Homo sapiens] (SEQ ID NO: 80)

5) gi|12840422|dbj|BAB24847.1| (AK007061) putative [Mus musculus] (SEQ ID NO: 81)

6) gi|10334445|emb|CAC10195.1| (AL133545) bA386N14.1 (novel protein similar to a
dual specificity phosphatase) [Homo sapiens] (SEQ ID NO: 82)

[ClustalW alignment of NOV9 with gi|17485142|, gi|18043293|, gi|13278657|, gi|12840422| and
gi|10334445|; the aligned residue columns are not recoverable from the source text.]

Tables 9E, 9F and 9G list the domain description from DOMAIN analysis results against
NOV9. This indicates that the NOV9 sequence has properties similar to those of other proteins
known to contain these domains.

Table 9E. Domain Analysis of NOV9

gnl|Smart|smart00195, DSPc, Dual specificity phosphatase, catalytic domain

(SEQ ID NO: 83)
CD-Length = 139 residues, 100.0% aligned

Score = 134 bits (336), Expect = 6e-33

NOV9:   19 GLSQITKSLYISNGVAANNKLMLSSNQITMVINVSVEVVNTLYEDIQYMQVPVADSPNSR 78
Sbjct:   1 GPSEILPHLYLGSYSDASNLALLKKLGITHVINVTEEVPNSNKSGFLYLGIPVDDNTETK 60

NOV9:   79 LCDFFDPIADHIHSVEMKQGRTLLHCAAGVSRSAALCLAYLMKYHAMSLLDAHTWTKSCR 138
Sbjct:  61 ISPYLPEAVEFIEDAEKKGGKVLVHCQAGVSRSATLIIAYLMKYRNMSLNDAYDFVKERR 120

NOV9:  139 PIIRPNSGFWEQLIHYEFQ 157
Sbjct: 121 PIISPNFGFLRQLIEYERK 139

[Match lines of the alignment are illegible in the source.]



Table 9F. Domain Analysis of NOV9 

gnl|Pfam|pfam00782, DSPc, Dual specificity phosphatase, catalytic domain.
Ser/Thr and Tyr protein phosphatases. The enzyme's tertiary fold is highly similar
to that of tyrosine-specific phosphatases, except for a "recognition" region.

(SEQ ID NO: 84)
CD-Length = 139 residues, 100.0% aligned

Score = 134 bits (336), Expect = 6e-33

NOV9:   19 GLSQITKSLYISNGVAANNKLMLSSNQITMVINVSVEVVNTLYEDIQYMQVPVADSPNSR 78
Sbjct:   1 GPSEILPHLYLGSYPTASNLAFLSKLGITHVINVTEEVPNSKNSGFLYLHIPVDDNHETD 60

NOV9:   79 LCDFFDPIADHIHSVEMKQGRTLLHCAAGVSRSAALCLAYLMKYHAMSLLDAHTWTKSCR 138
Sbjct:  61 ISPYLDEAVEFIEDARQKGGKVLVHCQAGISRSATLIIAYLMKTRNLSLNEAYSFVKERR 120

NOV9:  139 PIIRPNSGFWEQLIHYEFQ 157
Sbjct: 121 PIISPNFGFKRQLIEYERK 139

[Match lines of the alignment are illegible in the source.]

Table 9G. Domain Analysis of NOV9

gnl|Smart|smart00194, PTPc, Protein tyrosine phosphatase, catalytic domain

(SEQ ID NO: 85)
CD-Length = 264 residues, 12.5% aligned

Score = 35.0 bits (79), Expect = 0.004

NOV9:   88 DHIHSVEMKQGRTLLHCAAGVSRSAALCLAYLM 120
Sbjct: 187 RKSQSTLRNSGPIVVHCSAGVGRTGTFIAIDIL 219

[Match line of the alignment is illegible in the source.]

Mitogen-activated protein (MAP) kinase phosphatases constitute a growing family of dual
specificity phosphatases thought to play a role in the dephosphorylation and inactivation of MAP
kinases and are therefore likely to be important in the regulation of diverse cellular processes such
as proliferation, differentiation, and apoptosis. For this reason it has been suggested that MAP
kinase phosphatases may be tumor suppressors. DUSP6 (alias PYST1), one of the dual-specificity
tyrosine phosphatases, is localized on 12q21, one of the regions of frequent allelic loss in
pancreatic cancer. This gene is composed of three exons, and two forms of alternatively spliced
transcripts are ubiquitously expressed. Although no mutations were observed in 26 pancreatic
cancer cell lines, reduced expression of the full-length transcripts was observed in some cell
lines, which may suggest some role for DUSP6 in pancreatic carcinogenesis. PMID: 9858808

The mitogen-induced gene, DUSP2, encodes a nuclear protein, PAC1, that acts as a
dual-specific protein phosphatase with stringent substrate specificity for MAP kinase. MAP kinase
phosphorylation and consequent enzymatic activation is a central and often obligatory component
in signal transduction initiated by growth factor stimulation or resulting from various types of
oncogenic transformation. DUSP2 downregulates intracellular signal transduction through the
dephosphorylation/inactivation of MAP kinases. PMID: 7590752

Keyse and Emslie (1992) isolated and characterized a cDNA, which they designated
CL100, corresponding to an mRNA that is highly inducible by oxidative stress and heat shock in
human skin cells. The cDNA was obtained by differential screening of a library made from
normal human skin fibroblasts stressed for 2 hours in a solution of hydrogen peroxide. The cDNA
contains an open reading frame specifying a 367-residue protein of 39.3 kD predicted molecular
mass with the structural features of a nonreceptor type protein-tyrosine phosphatase. It has
significant amino acid sequence similarity to a tyr/ser-protein phosphatase encoded by the late
gene H1 of vaccinia virus. The purified protein encoded by the open reading frame expressed in
bacteria has intrinsic phosphatase activity. Given the relationship between the levels of
protein-tyrosine phosphorylation, receptor activity, cellular proliferation, and cell-cycle control, Keyse
and Emslie (1992) concluded that induction of this gene may play an important regulatory role in
the human cellular response to environmental stress. Alessi et al. (1993) found that the
phosphatase encoded by CL100 has dual specificity for tyrosine and threonine and that it
specifically inactivates mitogen-activated protein kinase in vitro. Brondello et al. (1999)
determined that DUSP1, which they called MKP1, is a labile protein with a half-life of
approximately 45 minutes in CCL39 hamster fibroblasts. Its degradation was attenuated by
inhibitors of the ubiquitin-directed proteasome complex. MKP1 was a target in vivo and in vitro
for p42MAPK (176948) or p44MAPK (601795), which phosphorylates MKP1 on 2 C-terminal
serine residues, ser359 and ser364. This phosphorylation did not modify MKP1's intrinsic ability
to dephosphorylate p44MAPK, but led to stabilization of the protein. Brondello et al. (1999)
concluded that these results illustrated the importance of regulated protein degradation in the
control of mitogenic signaling.

The protein similarity information, expression pattern, and map location for the NOV9
protein and nucleic acid disclosed herein suggest that they may have important structural and/or
physiological functions characteristic of the dual specificity phosphatase family. Therefore, the
nucleic acids and proteins of the invention are useful in potential diagnostic and therapeutic
applications and as a research tool. These include serving as a specific or selective nucleic acid
or protein diagnostic and/or prognostic marker, wherein the presence or amount of the nucleic acid
or the protein is to be assessed, as well as potential therapeutic applications such as the
following: (i) a protein therapeutic, (ii) a small molecule drug target, (iii) an antibody target
(therapeutic, diagnostic, drug targeting/cytotoxic antibody), (iv) a nucleic acid useful in gene
therapy (gene delivery/gene ablation), (v) a composition promoting tissue regeneration in vitro and
in vivo, and (vi) a biological defense weapon.

The nucleic acids and proteins of the invention are useful in potential diagnostic and
therapeutic applications implicated in various diseases and disorders described below and/or other
pathologies. For example, the compositions of the present invention will have efficacy for
treatment of patients suffering from: brain disorders including epilepsy, eating disorders,
schizophrenia, ADD, and cancer; heart disease; blood disorders; kidney disorders; liver diseases;
inflammation and autoimmune disorders including Crohn's disease, IBD, allergies, rheumatoid
arthritis and osteoarthritis; inflammatory skin disorders; psoriasis; colon,
ovarian, testicular, lymphatic, brain, and pancreatic cancers; leukemia; AIDS; thalamus
disorders; metabolic disorders including diabetes and obesity; lung diseases such as asthma,
emphysema, cystic fibrosis, and cancer; pancreatic disorders including pancreatic insufficiency;
and prostate disorders including prostate cancer; and other diseases, disorders and conditions of
the like.

NOVX Nucleic Acids and Polypeptides 

One aspect of the invention pertains to isolated nucleic acid molecules that encode NOVX 
polypeptides or biologically active portions thereof. Also included in the invention are nucleic 
acid fragments sufficient for use as hybridization probes to identify NOVX-encoding nucleic 
acids (e.g., NOVX mRNAs) and fragments for use as PCR primers for the amplification and/or 
mutation of NOVX nucleic acid molecules. As used herein, the term "nucleic acid molecule" is 
intended to include DNA molecules (e.g., cDNA or genomic DNA), RNA molecules (e.g., 
mRNA), analogs of the DNA or RNA generated using nucleotide analogs, and derivatives, 
fragments and homologs thereof. The nucleic acid molecule may be single-stranded or
double-stranded, but preferably comprises double-stranded DNA.

An NOVX nucleic acid can encode a mature NOVX polypeptide. As used herein, a 
"mature" form of a polypeptide or protein disclosed in the present invention is the product of a 
naturally occurring polypeptide or precursor form or proprotein. The naturally occurring 
polypeptide, precursor or proprotein includes, by way of nonlimiting example, the full-length 
gene product, encoded by the corresponding gene. Alternatively, it may be defined as the 
polypeptide, precursor or proprotein encoded by an ORF described herein. The "mature" form of
the product arises, again by way of nonlimiting example, as a result of one or more naturally
occurring processing steps as they may take place within the cell, or host cell, in which the gene
product arises. Examples of such processing steps leading to a "mature" form of a polypeptide or
protein include the cleavage of the N-terminal methionine residue encoded by the initiation codon
of an ORF, or the proteolytic cleavage of a signal peptide or leader sequence. Thus a mature form
arising from a precursor polypeptide or protein that has residues 1 to N, where residue 1 is the
N-terminal methionine, would have residues 2 through N remaining after removal of the N-terminal
methionine. Alternatively, a mature form arising from a precursor polypeptide or protein having
residues 1 to N, in which an N-terminal signal sequence from residue 1 to residue M is cleaved,
would have the residues from residue M+1 to residue N remaining. Further as used herein, a
"mature" form of a polypeptide or protein may arise from a step of post-translational modification 
other than a proteolytic cleavage event. Such additional processes include, by way of non- 
limiting example, glycosylation, myristoylation or phosphorylation. In general, a mature 
polypeptide or protein may result from the operation of only one of these processes, or a 
combination of any of them. 
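
The two proteolytic cases described above (residues 2 through N, and residues M+1 through N)
amount to simple slices of the precursor sequence. The following Python sketch is illustrative
only; the function names and the example precursor are hypothetical and are not taken from the
disclosure.

```python
# Illustrative sketch only: function names and the example precursor are
# hypothetical. Residues are numbered from 1 in the text, so residue i is
# at Python index i - 1.

def remove_initiator_methionine(precursor):
    """Return residues 2..N after removal of the N-terminal methionine."""
    if not precursor.startswith("M"):
        raise ValueError("precursor does not begin with methionine")
    return precursor[1:]


def remove_signal_sequence(precursor, m):
    """Return residues M+1..N after cleavage of a signal sequence spanning 1..M."""
    if not 0 < m < len(precursor):
        raise ValueError("M must lie inside the precursor")
    return precursor[m:]


if __name__ == "__main__":
    example = "MKTLLVAGSAQ"  # hypothetical precursor, N = 11
    print(remove_initiator_methionine(example))   # residues 2..11
    print(remove_signal_sequence(example, 5))     # residues 6..11, assuming M = 5
```

Post-translational modifications such as glycosylation or phosphorylation do not change the
residue string and are therefore outside what this sketch models.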

The term "probes", as utilized herein, refers to nucleic acid sequences of variable length,
preferably at least about 10 nucleotides (nt), 100 nt, or as many as approximately, e.g.,
6,000 nt, depending upon the specific use. Probes are used in the detection of identical, similar,
or complementary nucleic acid sequences. Longer length probes are generally obtained from a
natural or recombinant source, are highly specific, and are much slower to hybridize than
shorter-length oligomer probes. Probes may be single- or double-stranded and designed to have
specificity in PCR, membrane-based hybridization technologies, or ELISA-like technologies.

The term "isolated" nucleic acid molecule, as utilized herein, is one that is separated
from other nucleic acid molecules which are present in the natural source of the nucleic acid.
Preferably, an "isolated" nucleic acid is free of sequences which naturally flank the nucleic acid
(i.e., sequences located at the 5'- and 3'-termini of the nucleic acid) in the genomic DNA of the
organism from which the nucleic acid is derived. For example, in various embodiments, the 
isolated NOVX nucleic acid molecules can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 
kb or 0.1 kb of nucleotide sequences which naturally flank the nucleic acid molecule in genomic 
DNA of the cell/tissue from which the nucleic acid is derived (e.g., brain, heart, liver, spleen, 
etc.). Moreover, an "isolated" nucleic acid molecule, such as a cDNA molecule, can be 
substantially free of other cellular material or culture medium when produced by recombinant 
techniques, or of chemical precursors or other chemicals when chemically synthesized. 

A nucleic acid molecule of the invention, e.g., a nucleic acid molecule having the
nucleotide sequence of SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27 and 29, or a
complement of this aforementioned nucleotide sequence, can be isolated using standard molecular
biology techniques and the sequence information provided herein. Using all or a portion of the
nucleic acid sequence of SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27 and 29 as a
hybridization probe, NOVX molecules can be isolated using standard hybridization and cloning
techniques (e.g., as described in Sambrook, et al., (eds.), Molecular Cloning: A Laboratory
Manual 2nd Ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1989; and
Ausubel, et al., (eds.), Current Protocols in Molecular Biology, John Wiley & Sons, New
York, NY, 1993).

A nucleic acid of the invention can be amplified using cDNA, mRNA or alternatively,
genomic DNA, as a template and appropriate oligonucleotide primers according to standard PCR
amplification techniques. The nucleic acid so amplified can be cloned into an appropriate vector
and characterized by DNA sequence analysis. Furthermore, oligonucleotides corresponding to
NOVX nucleotide sequences can be prepared by standard synthetic techniques, e.g., using an
automated DNA synthesizer.

As used herein, the term "oligonucleotide" refers to a series of linked nucleotide residues,
which oligonucleotide has a sufficient number of nucleotide bases to be used in a PCR reaction.
A short oligonucleotide sequence may be based on, or designed from, a genomic or cDNA
sequence and is used to amplify, confirm, or reveal the presence of an identical, similar or
complementary DNA or RNA in a particular cell or tissue. Oligonucleotides comprise portions of
a nucleic acid sequence having about 10 nt, 50 nt, or 100 nt in length, preferably about 15 nt to 30
nt in length. In one embodiment of the invention, an oligonucleotide comprising a nucleic acid
molecule less than 100 nt in length would further comprise at least 6 contiguous nucleotides of SEQ
ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27 and 29, or a complement thereof.
Oligonucleotides may be chemically synthesized and may also be used as probes.
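
The length and contiguity conditions of the embodiment above can be expressed as a simple check.
The following Python sketch is illustrative only: the function names are hypothetical, the
reference is the first 67 nucleotides of Table 9A, and "a complement thereof" is assumed here to
mean the reverse complement read 5' to 3'.

```python
# Illustrative sketch only: function names are hypothetical, and the
# "complement" is assumed to be the reverse complement read 5'->3'.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def shares_contiguous_run(oligo, reference, k=6):
    """True if any k contiguous nucleotides of `oligo` occur in `reference`
    or in its reverse complement."""
    targets = (reference, reverse_complement(reference))
    return any(
        oligo[i:i + k] in target
        for i in range(len(oligo) - k + 1)
        for target in targets
    )


def fits_embodiment(oligo, reference, max_len=100, k=6):
    """True if the oligo is shorter than max_len and shares at least k
    contiguous nucleotides with the reference or its complement."""
    return len(oligo) < max_len and shares_contiguous_run(oligo, reference, k)


if __name__ == "__main__":
    # First 67 nucleotides of Table 9A (SEQ ID NO:29).
    reference = "CTTTGAGCTTCTCTGACTGCTGACCACTGACCCACCGACTTGATGACAGCACCCTCGTGTGCCTTCC"
    print(fits_embodiment("ATGACAGCACCCTCG", reference))                      # sense-strand 15-mer
    print(fits_embodiment(reverse_complement("ATGACAGCACCCTCG"), reference))  # antisense 15-mer
    print(fits_embodiment("AAAAAAAAAAAAAAA", reference))                      # unrelated 15-mer
```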

In another embodiment, an isolated nucleic acid molecule of the invention comprises a
nucleic acid molecule that is a complement of the nucleotide sequence shown in SEQ ID NOS:1,
3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27 and 29, or a portion of this nucleotide sequence (e.g., a
fragment that can be used as a probe or primer or a fragment encoding a biologically-active
portion of an NOVX polypeptide). A nucleic acid molecule that is complementary to the
nucleotide sequence shown in SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31,
33, 35, 37, 39 or 41 is one that is sufficiently complementary to the nucleotide sequence shown in
SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39 or 41 that it can
hydrogen bond with few or no mismatches to the nucleotide sequence shown in SEQ ID NOS:1, 3,
5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27 and 29, thereby forming a stable duplex.

As used herein, the term "complementary" refers to Watson-Crick or Hoogsteen base
pairing between nucleotide units of a nucleic acid molecule, and the term "binding" means the
physical or chemical interaction between two polypeptides or compounds or associated
polypeptides or compounds or combinations thereof. Binding includes ionic, non-ionic, van der
Waals, hydrophobic interactions, and the like. A physical interaction can be either direct or
indirect. Indirect interactions may be through or due to the effects of another polypeptide or
compound. Direct binding refers to interactions that do not take place through, or due to, the
effect of another polypeptide or compound, but instead are without other substantial chemical
intermediates.
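
The notion of a complement that pairs with few or no mismatches can be made concrete by counting
the positions at which two antiparallel strands fail to form a Watson-Crick pair. The following
Python sketch is illustrative only: the function name is hypothetical, only Watson-Crick A-T and
G-C pairs are modeled (Hoogsteen pairing is not), and both strands are assumed to be written 5' to
3' and to have equal length.

```python
# Illustrative sketch only: the function name is hypothetical, and only
# Watson-Crick pairing of equal-length DNA strands is modeled.

WATSON_CRICK = {"A": "T", "T": "A", "G": "C", "C": "G"}


def duplex_mismatches(strand, other):
    """Count positions at which `other`, read antiparallel to `strand`,
    fails to form a Watson-Crick pair. Both strands are written 5'->3'."""
    if len(strand) != len(other):
        raise ValueError("strands must have equal length")
    return sum(
        WATSON_CRICK[base] != partner
        for base, partner in zip(strand, reversed(other))
    )


if __name__ == "__main__":
    strand = "ATGACAGCACCC"  # nucleotides 43-54 of Table 9A
    print(duplex_mismatches(strand, "GGGTGCTGTCAT"))  # exact reverse complement
    print(duplex_mismatches(strand, "GGGTGCTGTCAA"))  # one altered base
```

Whether a given mismatch count still permits a stable duplex depends on length, base composition
and hybridization conditions, none of which this count captures.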

Fragments provided herein are defined as sequences of at least 6 (contiguous) nucleic 
acids or at least 4 (contiguous) amino acids, a length sufficient to allow for specific hybridization 
in the case of nucleic acids or for specific recognition of an epitope in the case of amino acids, 
respectively, and are at most some portion less than a full length sequence. Fragments may be 
derived from any contiguous portion of a nucleic acid or amino acid sequence of choice. 
Derivatives are nucleic acid sequences or amino acid sequences formed from the native 
compounds either directly or by modification or partial substitution. Analogs are nucleic acid
sequences or amino acid sequences that have a structure similar to, but not identical to, the native
compound, differing from it with respect to certain components or side chains. Analogs may be
synthetic or from a different evolutionary origin and may have a similar or opposite metabolic 
activity compared to wild type. Homologs are nucleic acid sequences or amino acid sequences of 
a particular gene that are derived from different species. 

Derivatives and analogs may be full length or other than full length, if the derivative or 
analog contains a modified nucleic acid or amino acid, as described below. Derivatives or 
analogs of the nucleic acids or proteins of the invention include, but are not limited to, molecules 
comprising regions that are substantially homologous to the nucleic acids or proteins of the 
invention, in various embodiments, by at least about 70%, 80%, or 95% identity (with a preferred 
identity of 80-95%) over a nucleic acid or amino acid sequence of identical size or when 
compared to an aligned sequence in which the alignment is done by a computer homology 
program known in the art, or whose encoding nucleic acid is capable of hybridizing to the 
complement of a sequence encoding the aforementioned proteins under stringent, moderately
stringent, or low stringent conditions. See, e.g., Ausubel, et al., Current Protocols in
Molecular Biology, John Wiley & Sons, New York, NY, 1993, and below.
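
By way of illustration only, the percent identity thresholds recited above (e.g., about 70%, 80%, or 95%) can be evaluated over a pair of sequences that have already been aligned. The following is a minimal sketch, not the computer homology program referred to in the text; the function name percent_identity, the "-" gap symbol, and the convention of skipping gap columns are assumptions made for this example.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumes the two strings are already aligned (equal length, gaps as `gap`).
    Skipping gap columns is one of several conventions in use.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # gap column: not counted in this convention
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical 10-nt aligned pair differing at one position
    print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
```

A different gap convention (for example, counting gap columns as mismatches) would give lower values for the same alignment, so the convention should be stated whenever a threshold is applied.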

A "homologous nucleic acid sequence" or "homologous amino acid sequence," or 
variations thereof, refer to sequences characterized by a homology at the nucleotide level or 
amino acid level as discussed above. Homologous nucleotide sequences encode those sequences
coding for isoforms of NOVX polypeptides. Isoforms can be expressed in different tissues of the
same organism as a result of, for example, alternative splicing of RNA. Alternatively, isoforms
can be encoded by different genes. In the invention, homologous nucleotide sequences include
nucleotide sequences encoding an NOVX polypeptide of species other than humans,
including, but not limited to, vertebrates, and thus can include, e.g., frog, mouse, rat, rabbit, dog,
cat, cow, horse, and other organisms. Homologous nucleotide sequences also include, but are not
limited to, naturally occurring allelic variations and mutations of the nucleotide sequences set
forth herein. A homologous nucleotide sequence does not, however, include the exact nucleotide
sequence encoding human NOVX protein. Homologous nucleic acid sequences include those
nucleic acid sequences that encode conservative amino acid substitutions (see below) in SEQ ID
NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27 and 29, as well as a polypeptide possessing
NOVX biological activity. Various biological activities of the NOVX proteins are described 
below. 

An NOVX polypeptide is encoded by the open reading frame ("ORF") of an NOVX 
nucleic acid. An ORF corresponds to a nucleotide sequence that could potentially be translated 
into a polypeptide. A stretch of nucleic acids comprising an ORF is uninterrupted by a stop 
codon. An ORF that represents the coding sequence for a full protein begins with an ATG "start"
codon and terminates with one of the three "stop" codons, namely, TAA, TAG, or TGA. For the
purposes of this invention, an ORF may be any part of a coding sequence, with or without a start
codon, a stop codon, or both. For an ORF to be considered as a good candidate for coding for a
bona fide cellular protein, a minimum size requirement is often set, e.g., a stretch of DNA that
would encode a protein of 50 amino acids or more. 
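
The ORF definition above (an ATG start codon, an uninterrupted run of codons, one of the stop codons TAA, TAG, or TGA, and a minimum size such as 50 amino acids) may be sketched as a simple scan of the three forward reading frames. This is an illustrative sketch only; the function name find_orfs and its return format are assumptions for this example, and the reverse strand is not scanned.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 50):
    """Return (start, end, frame) tuples for ATG...stop ORFs on the given strand.

    `start` is the index of the A of ATG, `end` is one past the stop codon,
    and `min_codons` counts encoded amino acids (the stop codon is excluded).
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if n_codons >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # resume the search after this stop codon
    return orfs


if __name__ == "__main__":
    # Hypothetical toy sequence; min_codons lowered so the short ORF is reported
    toy = "CC" + "ATG" + "GCT" * 3 + "TAA" + "G"
    print(find_orfs(toy, min_codons=3))
```

With the default minimum of 50 codons the toy sequence yields no ORF, which mirrors the size filter described in the text.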

The nucleotide sequences determined from the cloning of the human NOVX genes allow
for the generation of probes and primers designed for use in identifying and/or cloning NOVX
homologues in other cell types, e.g., from other tissues, as well as NOVX homologues from other
vertebrates. The probe/primer typically comprises substantially purified oligonucleotide. The
oligonucleotide typically comprises a region of nucleotide sequence that hybridizes under
stringent conditions to at least about 12, 25, 50, 100, 150, 200, 250, 300, 350 or 400 consecutive
nucleotides of the sense strand nucleotide sequence of SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27 and
29; or an anti-sense strand nucleotide sequence of SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19,
21, 23, 25, 27 and 29; or of a naturally occurring mutant of SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15,
17, 19, 21, 23, 25, 27 and 29.

Probes based on the human NOVX nucleotide sequences can be used to detect transcripts 
or genomic sequences encoding the same or homologous proteins. In various embodiments, the 
probe further comprises a label group attached thereto, e.g. the label group can be a radioisotope, 
a fluorescent compound, an enzyme, or an enzyme co-factor. Such probes can be used as a part of 
a diagnostic test kit for identifying cells or tissues which mis-express an NOVX protein, such as
by measuring a level of an NOVX-encoding nucleic acid in a sample of cells from a subject, e.g.,
detecting NOVX mRNA levels or determining whether a genomic NOVX gene has been mutated 
or deleted. 

"A polypeptide having a biologically-active portion of an NOVX polypeptide" refers to 
polypeptides exhibiting activity similar, but not necessarily identical to, an activity of a 
polypeptide of the invention, including mature forms, as measured in a particular biological assay, 
with or without dose dependency. A nucleic acid fragment encoding a "biologically-active
portion of NOVX" can be prepared by isolating a portion of SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15,
17, 19, 21, 23, 25, 27 and 29 that encodes a polypeptide having an NOVX biological activity (the
biological activities of the NOVX proteins are described below), expressing the encoded portion 
of NOVX protein (e.g., by recombinant expression in vitro) and assessing the activity of the 
encoded portion of NOVX. 

NOVX Nucleic Acid and Polypeptide Variants 

The invention further encompasses nucleic acid molecules that differ from the nucleotide
sequences shown in SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27 and 29 due to
degeneracy of the genetic code and thus encode the same NOVX proteins as those encoded by the
nucleotide sequences shown in SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27 and
29. In another embodiment, an isolated nucleic acid molecule of the invention has a nucleotide
sequence encoding a protein having an amino acid sequence shown in SEQ ID NOS:2, 4, 6, 8, 10,
12, 14, 16, 18, 20, 22, 24, 26, 28 and 30.

In addition to the human NOVX nucleotide sequences shown in SEQ ID NOS:1, 3, 5, 7, 9,
11, 13, 15, 17, 19, 21, 23, 25, 27 and 29, it will be appreciated by those skilled in the art that DNA
sequence polymorphisms that lead to changes in the amino acid sequences of the NOVX
polypeptides may exist within a population (e.g., the human population). Such genetic
polymorphism in the NOVX genes may exist among individuals within a population due to
natural allelic variation. As used herein, the terms "gene" and "recombinant gene" refer to nucleic
acid molecules comprising an open reading frame (ORF) encoding an NOVX protein, preferably
a vertebrate NOVX protein. Such natural allelic variations can typically result in 1-5% variance
in the nucleotide sequence of the NOVX genes. Any and all such nucleotide variations and
resulting amino acid polymorphisms in the NOVX polypeptides, which are the result of natural
allelic variation and that do not alter the functional activity of the NOVX polypeptides, are
intended to be within the scope of the invention.
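
The distinction drawn above between nucleotide differences that are silent owing to degeneracy of the genetic code and polymorphisms that change the encoded amino acid sequence can be checked by translating both sequences and comparing the results. The sketch below assumes the standard genetic code; the names CODON_TABLE, translate, and encodes_same_protein are hypothetical and used for illustration only, and the example sequences are toy sequences, not NOVX sequences.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a coding sequence in frame 0; '*' marks a stop codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def encodes_same_protein(dna_a: str, dna_b: str) -> bool:
    """True when two coding sequences translate to the same amino acid sequence."""
    return translate(dna_a) == translate(dna_b)


if __name__ == "__main__":
    reference = "ATGGCTAAA"   # Met-Ala-Lys
    degenerate = "ATGGCCAAG"  # differs at two third-codon positions
    missense = "ATGGATAAA"    # Ala codon changed to an Asp codon
    print(translate(reference), encodes_same_protein(reference, degenerate))
    print(translate(missense), encodes_same_protein(reference, missense))
```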

Moreover, nucleic acid molecules encoding NOVX proteins from other species, and thus
that have a nucleotide sequence that differs from the human SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15,
17, 19, 21, 23, 25, 27 and 29, are intended to be within the scope of the invention. Nucleic acid
molecules corresponding to natural allelic variants and homologues of the NOVX cDNAs of the
invention can be isolated based on their homology to the human NOVX nucleic acids disclosed
herein using the human cDNAs, or a portion thereof, as a hybridization probe according to
standard hybridization techniques under stringent hybridization conditions.

Accordingly, in another embodiment, an isolated nucleic acid molecule of the invention is
at least 6 nucleotides in length and hybridizes under stringent conditions to the nucleic acid
molecule comprising the nucleotide sequence of SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21,
23, 25, 27 and 29. In another embodiment, the nucleic acid is at least 10, 25, 50, 100, 250, 500,
750, 1000, 1500, or 2000 or more nucleotides in length. In yet another embodiment, an isolated
nucleic acid molecule of the invention hybridizes to the coding region. As used herein, the term
"hybridizes under stringent conditions" is intended to describe conditions for hybridization and
washing under which nucleotide sequences at least 60% homologous to each other typically
remain hybridized to each other.

Homologs (i.e., nucleic acids encoding NOVX proteins derived from species other than
human) or other related sequences (e.g., paralogs) can be obtained by low, moderate or high
stringency hybridization with all or a portion of the particular human sequence as a probe using
methods well known in the art for nucleic acid hybridization and cloning.

As used herein, the phrase "stringent hybridization conditions" refers to conditions under
which a probe, primer or oligonucleotide will hybridize to its target sequence, but to no other
sequences. Stringent conditions are sequence-dependent and will be different in different
circumstances. Longer sequences hybridize specifically at higher temperatures than shorter
sequences. Generally, stringent conditions are selected to be about 5°C lower than the thermal
melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm is the
temperature (under defined ionic strength, pH and nucleic acid concentration) at which 50% of the
probes complementary to the target sequence hybridize to the target sequence at equilibrium.
Since the target sequences are generally present at excess, at Tm, 50% of the probes are occupied
at equilibrium. Typically, stringent conditions will be those in which the salt concentration is less
than about 1.0 M sodium ion, typically about 0.01 to 1.0 M sodium ion (or other salts) at pH 7.0
to 8.3 and the temperature is at least about 30°C for short probes, primers or oligonucleotides
(e.g., 10 nt to 50 nt) and at least about 60°C for longer probes, primers and oligonucleotides.
Stringent conditions may also be achieved with the addition of destabilizing agents, such as
formamide.
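
As a rough numerical illustration of selecting a temperature about 5°C below the Tm, the sketch below estimates the Tm of a short oligonucleotide and subtracts the offset. The estimate uses the common rule of thumb Tm = 2(A+T) + 4(G+C) in °C, which is an assumption introduced for this example and is not taken from the present disclosure; it ignores ionic strength, pH, probe concentration and formamide, all of which the definition above treats as relevant. The function names are hypothetical.

```python
def rule_of_thumb_tm(oligo: str) -> float:
    """Approximate Tm in deg C for a short DNA oligo: 2*(A+T) + 4*(G+C).

    Assumption for illustration only; real Tm depends on salt, pH,
    concentration and destabilizing agents such as formamide.
    """
    seq = oligo.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if not seq or at + gc != len(seq):
        raise ValueError("oligo must be a non-empty string of A, C, G and T")
    return 2.0 * at + 4.0 * gc


def stringent_temperature(oligo: str, offset: float = 5.0) -> float:
    """Temperature about `offset` deg C below the estimated Tm."""
    return rule_of_thumb_tm(oligo) - offset


if __name__ == "__main__":
    probe = "ATGCATGCATGC"  # hypothetical 12-nt probe
    print(rule_of_thumb_tm(probe), stringent_temperature(probe))
```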

Stringent conditions are known to those skilled in the art and can be found in Ausubel, et
al., (eds.), Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989),
6.3.1-6.3.6. Preferably, the conditions are such that sequences at least about 65%, 70%, 75%,
85%, 90%, 95%, 98%, or 99% homologous to each other typically remain hybridized to each
other. A non-limiting example of stringent hybridization conditions is hybridization in a high
salt buffer comprising 6X SSC, 50 mM Tris-HCl (pH 7.5), 1 mM EDTA, 0.02% PVP, 0.02%
Ficoll, 0.02% BSA, and 500 mg/ml denatured salmon sperm DNA at 65°C, followed by one or
more washes in 0.2X SSC, 0.01% BSA at 50°C. An isolated nucleic acid molecule of the
invention that hybridizes under stringent conditions to the sequences SEQ ID NOS:1, 3, 5, 7, 9,
11, 13, 15, 17, 19, 21, 23, 25, 27 and 29, corresponds to a naturally-occurring nucleic acid
molecule. As used herein, a "naturally-occurring" nucleic acid molecule refers to an RNA or
DNA molecule having a nucleotide sequence that occurs in nature (e.g., encodes a natural
protein).

In a second embodiment, a nucleic acid sequence that is hybridizable to the nucleic acid
molecule comprising the nucleotide sequence of SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21,
23, 25, 27 and 29, or fragments, analogs or derivatives thereof, under conditions of moderate
stringency is provided. A non-limiting example of moderate stringency hybridization conditions
is hybridization in 6X SSC, 5X Denhardt's solution, 0.5% SDS and 100 mg/ml denatured salmon
sperm DNA at 55°C, followed by one or more washes in 1X SSC, 0.1% SDS at 37°C. Other
conditions of moderate stringency that may be used are well-known within the art. See, e.g.,
Ausubel, et al. (eds.), 1993, Current Protocols in Molecular Biology, John Wiley & Sons,
NY, and Kriegler, 1990, Gene Transfer and Expression, A Laboratory Manual, Stockton
Press, NY.

In a third embodiment, a nucleic acid that is hybridizable to the nucleic acid molecule
comprising the nucleotide sequences SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27
and 29, or fragments, analogs or derivatives thereof, under conditions of low stringency, is
provided. A non-limiting example of low stringency hybridization conditions is hybridization in
35% formamide, 5X SSC, 50 mM Tris-HCl (pH 7.5), 5 mM EDTA, 0.02% PVP, 0.02% Ficoll,
0.2% BSA, 100 mg/ml denatured salmon sperm DNA, 10% (wt/vol) dextran sulfate at 40°C,
followed by one or more washes in 2X SSC, 25 mM Tris-HCl (pH 7.4), 5 mM EDTA, and 0.1%
SDS at 50°C. Other conditions of low stringency that may be used are well known in the art (e.g.,
as employed for cross-species hybridizations). See, e.g., Ausubel, et al. (eds.), 1993, Current
Protocols in Molecular Biology, John Wiley & Sons, NY, and Kriegler, 1990, Gene
Transfer and Expression, A Laboratory Manual, Stockton Press, NY; Shilo and Weinberg,
1981. Proc Natl Acad Sci USA 78: 6789-6792.

Conservative Mutations 

In addition to naturally-occurring allelic variants of NOVX sequences that may exist in the
population, the skilled artisan will further appreciate that changes can be introduced by mutation
into the nucleotide sequences SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27 and 29,
thereby leading to changes in the amino acid sequences of the encoded NOVX proteins, without
altering the functional ability of said NOVX proteins. For example, nucleotide substitutions
leading to amino acid substitutions at "non-essential" amino acid residues can be made in the
sequence SEQ ID NOS:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28 and 30. A "non-essential"
amino acid residue is a residue that can be altered from the wild-type sequences of the NOVX
proteins without altering their biological activity, whereas an "essential" amino acid residue is
required for such biological activity. For example, amino acid residues that are conserved among
the NOVX proteins of the invention are predicted to be particularly non-amenable to alteration.
Amino acids for which conservative substitutions can be made are well-known within the art.

Another aspect of the invention pertains to nucleic acid molecules encoding NOVX 
proteins that contain changes in amino acid residues that are not essential for activity. Such 
NOVX proteins differ in amino acid sequence from SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19,
21, 23, 25, 27 and 29 yet retain biological activity. In one embodiment, the isolated nucleic acid
molecule comprises a nucleotide sequence encoding a protein, wherein the protein comprises an
amino acid sequence at least about 45% homologous to the amino acid sequences SEQ ID NOS:2,
4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28 and 30. Preferably, the protein encoded by the
nucleic acid molecule is at least about 60% homologous to SEQ ID NOS:2, 4, 6, 8, 10, 12, 14, 16,
18, 20, 22, 24, 26, 28 and 30; more preferably at least about 70% homologous to SEQ ID NOS:2, 4,
6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28 and 30; still more preferably at least about 80% 
homologous to SEQ ID NOS:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28 and 30; even more 
preferably at least about 90% homologous to SEQ ID NOS:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 
24, 26, 28 and 30; and most preferably at least about 95% homologous to SEQ ID NOS:2, 4, 6, 8, 
10, 12, 14, 16, 18, 20, 22, 24, 26, 28 and 30. 

An isolated nucleic acid molecule encoding an NOVX protein homologous to the protein 
of SEQ ID NOS:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28 and 30 can be created by 
introducing one or more nucleotide substitutions, additions or deletions into the nucleotide 
sequence of SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27 and 29, such that one or
more amino acid substitutions, additions or deletions are introduced into the encoded protein. 

Mutations can be introduced into SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25,
27 and 29 by standard techniques, such as site-directed mutagenesis and PCR-mediated
mutagenesis. Preferably, conservative amino acid substitutions are made at one or more
predicted, non-essential amino acid residues. A "conservative amino acid substitution" is one in
which the amino acid residue is replaced with an amino acid residue having a similar side chain.
Families of amino acid residues having similar side chains have been defined within the art.
These families include amino acids with basic side chains (e.g., lysine, arginine, histidine), acidic
side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine,
asparagine, glutamine, serine, threonine, tyrosine, cysteine), nonpolar side chains (e.g., alanine,
valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), beta-branched side
chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine,
tryptophan, histidine). Thus, a predicted non-essential amino acid residue in the NOVX protein is
replaced with another amino acid residue from the same side chain family. Alternatively, in
another embodiment, mutations can be introduced randomly along all or part of an NOVX coding
sequence, such as by saturation mutagenesis, and the resultant mutants can be screened for NOVX
biological activity to identify mutants that retain activity. Following mutagenesis of SEQ ID NOS:1,
3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27 and 29, the encoded protein can be expressed by any
recombinant technology known in the art and the activity of the protein can be determined.

The relatedness of amino acid families may also be determined based on side chain
interactions. Substituted amino acids may be fully conserved "strong" residues or fully conserved
"weak" residues. The "strong" group of conserved amino acid residues may be any one of the
following groups: STA, NEQK, NHQK, NDEQ, QHRK, MILV, MILF, HY, FYW, wherein the
single letter amino acid codes are grouped by those amino acids that may be substituted for each
other. Likewise, the "weak" group of conserved residues may be any one of the following: CSA,
ATV, SAG, STNK, STPA, SGND, SNDEQK, NDEQHK, NEQHRK, VLIM, HFY, wherein the
letters within each group represent the single letter amino acid code.
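
The side chain families and the "strong" and "weak" conservation groups recited above lend themselves to a simple lookup. The following sketch encodes the groups exactly as listed in the text and classifies a proposed single-residue substitution; the names SIDE_CHAIN_FAMILIES, shared_families, and classify_substitution are assumptions for this example, and a lookup of this kind does not predict whether a given residue is in fact non-essential.

```python
# Side chain families as listed in the text (single letter codes)
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),
    "acidic": set("DE"),
    "uncharged polar": set("GNQSTYC"),
    "nonpolar": set("AVLIPFMW"),
    "beta-branched": set("TVI"),
    "aromatic": set("YFWH"),
}

# "Strong" and "weak" conservation groups as listed in the text
STRONG_GROUPS = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK", "MILV", "MILF", "HY", "FYW"]
WEAK_GROUPS = ["CSA", "ATV", "SAG", "STNK", "STPA", "SGND", "SNDEQK",
               "NDEQHK", "NEQHRK", "VLIM", "HFY"]


def shared_families(a: str, b: str) -> list:
    """Names of the side chain families that contain both residues."""
    a, b = a.upper(), b.upper()
    return sorted(name for name, members in SIDE_CHAIN_FAMILIES.items()
                  if a in members and b in members)


def classify_substitution(a: str, b: str) -> str:
    """Return 'identical', 'strong', 'weak' or 'none' for replacing a with b."""
    a, b = a.upper(), b.upper()
    if a == b:
        return "identical"
    if any(a in group and b in group for group in STRONG_GROUPS):
        return "strong"
    if any(a in group and b in group for group in WEAK_GROUPS):
        return "weak"
    return "none"


if __name__ == "__main__":
    for pair in [("K", "R"), ("S", "T"), ("C", "S"), ("W", "G")]:
        print(pair, shared_families(*pair), classify_substitution(*pair))
```

The strong groups are checked before the weak groups so that a pair appearing in both kinds of group is reported under the stricter category.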

In one embodiment, a mutant NOVX protein can be assayed for (i) the ability to form
protein:protein interactions with other NOVX proteins, other cell-surface proteins, or
biologically-active portions thereof; (ii) complex formation between a mutant NOVX protein and
an NOVX ligand; or (iii) the ability of a mutant NOVX protein to bind to an intracellular target
protein or biologically-active portion thereof (e.g., avidin proteins).

In yet another embodiment, a mutant NOVX protein can be assayed for the ability to
regulate a specific biological function (e.g., regulation of insulin release).

Antisense Nucleic Acids 

Another aspect of the invention pertains to isolated antisense nucleic acid molecules that
are hybridizable to or complementary to the nucleic acid molecule comprising the nucleotide
sequence of SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27 and 29, or fragments,
analogs or derivatives thereof. An "antisense" nucleic acid comprises a nucleotide sequence that
is complementary to a "sense" nucleic acid encoding a protein (e.g., complementary to the coding
strand of a double-stranded cDNA molecule or complementary to an mRNA sequence). In
specific aspects, antisense nucleic acid molecules are provided that comprise a sequence
complementary to at least about 10, 25, 50, 100, 250 or 500 nucleotides or an entire NOVX
coding strand, or to only a portion thereof. Nucleic acid molecules encoding fragments,
homologs, derivatives and analogs of an NOVX protein of SEQ ID NOS:2, 4, 6, 8, 10, 12, 14, 16,
18, 20, 22, 24, 26, 28 and 30, or antisense nucleic acids complementary to an NOVX nucleic acid
sequence of SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27 and 29, are additionally
provided.

In one embodiment, an antisense nucleic acid molecule is antisense to a "coding region" of 
the coding strand of a nucleotide sequence encoding an NOVX protein. The term "coding region" 
refers to the region of the nucleotide sequence comprising codons which are translated into amino 
acid residues. In another embodiment, the antisense nucleic acid molecule is antisense to a 
"noncoding region" of the coding strand of a nucleotide sequence encoding the NOVX protein. 
The term "noncoding region" refers to 5' and 3' sequences which flank the coding region that are
not translated into amino acids (i.e., also referred to as 5' and 3' untranslated regions). 

Given the coding strand sequences encoding the NOVX protein disclosed herein, antisense 
nucleic acids of the invention can be designed according to the rules of Watson and Crick or 
Hoogsteen base pairing. The antisense nucleic acid molecule can be complementary to the entire 
coding region of NOVX mRNA, but more preferably is an oligonucleotide that is antisense to 
only a portion of the coding or noncoding region of NOVX mRNA. For example, the antisense 
oligonucleotide can be complementary to the region surrounding the translation start site of 
NOVX mRNA. An antisense oligonucleotide can be, for example, about 5, 10, 15, 20, 25, 30, 35, 
40, 45 or 50 nucleotides in length. An antisense nucleic acid of the invention can be constructed 
using chemical synthesis or enzymatic ligation reactions using procedures known in the art. For 
example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically 
synthesized using naturally-occurring nucleotides or variously modified nucleotides designed to 
increase the biological stability of the molecules or to increase the physical stability of the duplex 
formed between the antisense and sense nucleic acids (e.g., phosphorothioate derivatives and 
acridine substituted nucleotides can be used). 
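
Designing an antisense oligonucleotide by Watson-Crick pairing, as described above, amounts to taking the reverse complement of the targeted stretch of the sense sequence. The sketch below builds a DNA oligonucleotide against a window around the translation start site. It is illustrative only: the function names are hypothetical, the first ATG (or AUG) in the input is assumed to be the translation start site, and no account is taken of modified nucleotides, secondary structure or Hoogsteen pairing.

```python
# Complement map for DNA or RNA input; output is written as DNA
_COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA or RNA sequence, returned as DNA (5'->3')."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def antisense_around_start(sense: str, length: int = 20, upstream: int = 5) -> str:
    """Antisense oligo covering `length` nt beginning `upstream` nt before the start codon.

    Assumption: the first ATG/AUG found is the translation start site.
    """
    seq = sense.upper().replace("U", "T")
    start = seq.find("ATG")
    if start < 0:
        raise ValueError("no ATG start codon found")
    begin = max(0, start - upstream)
    target = seq[begin:begin + length]
    return reverse_complement(target)


if __name__ == "__main__":
    toy_mrna = "GGCCAUGGCUAAAGGG"  # hypothetical sequence, not an NOVX sequence
    print(antisense_around_start(toy_mrna, length=10, upstream=2))
```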

Examples of modified nucleotides that can be used to generate the antisense nucleic acid
include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine,
4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine,
5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine, inosine,
N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine,
2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine,
7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil,
beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil,
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil,
queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil,
uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil,
3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Alternatively, the
antisense nucleic acid can be produced biologically using an expression vector into which a
nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the
inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest,
described further in the following subsection).

The antisense nucleic acid molecules of the invention are typically administered to a
subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or
genomic DNA encoding an NOVX protein to thereby inhibit expression of the protein (e.g., by
inhibiting transcription and/or translation). The hybridization can be by conventional nucleotide
complementarity to form a stable duplex, or, for example, in the case of an antisense nucleic acid
molecule that binds to DNA duplexes, through specific interactions in the major groove of the
double helix. An example of a route of administration of antisense nucleic acid molecules of the
invention includes direct injection at a tissue site. Alternatively, antisense nucleic acid molecules
can be modified to target selected cells and then administered systemically. For example, for
systemic administration, antisense molecules can be modified such that they specifically bind to
receptors or antigens expressed on a selected cell surface (e.g., by linking the antisense nucleic
acid molecules to peptides or antibodies that bind to cell surface receptors or antigens). The
antisense nucleic acid molecules can also be delivered to cells using the vectors described herein.
To achieve sufficient nucleic acid molecules, vector constructs in which the antisense nucleic acid
molecule is placed under the control of a strong pol II or pol III promoter are preferred.

In yet another embodiment, the antisense nucleic acid molecule of the invention is an
α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific
double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the
strands run parallel to each other. See, e.g., Gaultier, et al., 1987. Nucl. Acids Res. 15: 6625-6641.
The antisense nucleic acid molecule can also comprise a 2'-O-methylribonucleotide (See, e.g.,
Inoue, et al., 1987. Nucl. Acids Res. 15: 6131-6148) or a chimeric RNA-DNA analogue (See, e.g.,
Inoue, et al., 1987. FEBS Lett. 215: 327-330).

Ribozymes and PNA Moieties 

Nucleic acid modifications include, by way of non-limiting example, modified bases, and 
nucleic acids whose sugar phosphate backbones are modified or derivatized. These modifications 
are carried out at least in part to enhance the chemical stability of the modified nucleic acid, such 
that they may be used, for example, as antisense binding nucleic acids in therapeutic applications 
in a subject. 

In one embodiment, an antisense nucleic acid of the invention is a ribozyme. Ribozymes 
are catalytic RNA molecules with ribonuclease activity that are capable of cleaving a 
single-stranded nucleic acid, such as an mRNA, to which they have a complementary region. 
Thus, ribozymes (e.g., hammerhead ribozymes as described in Haselhoff and Gerlach 1988. 
Nature 334: 585-591) can be used to catalytically cleave NOVX mRNA transcripts to thereby 
inhibit translation of NOVX mRNA. A ribozyme having specificity for an NOVX-encoding 
nucleic acid can be designed based upon the nucleotide sequence of an NOVX cDNA disclosed 
herein (i.e., SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27 and 29). For example, a
derivative of a Tetrahymena L-19 IVS RNA can be constructed in which the nucleotide sequence
of the active site is complementary to the nucleotide sequence to be cleaved in an
NOVX-encoding mRNA. See, e.g., U.S. Patent 4,987,071 to Cech, et al. and U.S. Patent
5,116,742 to Cech, et al. NOVX mRNA can also be used to select a catalytic RNA having a
specific ribonuclease activity from a pool of RNA molecules. See, e.g., Bartel et al., (1993)
Science 261:1411-1418.

Alternatively, NOVX gene expression can be inhibited by targeting nucleotide sequences 
complementary to the regulatory region of the NOVX nucleic acid (e.g., the NOVX promoter 
and/or enhancers) to form triple helical structures that prevent transcription of the NOVX gene in 
target cells. See, e.g., Helene, 1991. Anticancer Drug Des. 6: 569-84; Helene, et al., 1992. Ann.
N.Y. Acad. Sci. 660: 27-36; Maher, 1992. Bioessays 14: 807-15.

In various embodiments, the NOVX nucleic acids can be modified at the base moiety,
sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or solubility of
the molecule. For example, the deoxyribose phosphate backbone of the nucleic acids can be
modified to generate peptide nucleic acids. See, e.g., Hyrup, et al., 1996. Bioorg Med Chem 4:
5-23. As used herein, the terms "peptide nucleic acids" or "PNAs" refer to nucleic acid mimics
(e.g., DNA mimics) in which the deoxyribose phosphate backbone is replaced by a pseudopeptide
backbone and only the four natural nucleobases are retained. The neutral backbone of PNAs has
been shown to allow for specific hybridization to DNA and RNA under conditions of low ionic
strength. The synthesis of PNA oligomers can be performed using standard solid phase peptide
synthesis protocols as described in Hyrup, et al., 1996. supra; Perry-O'Keefe, et al., 1996. Proc.
Natl. Acad. Sci. USA 93: 14670-14675.

PNAs of NOVX can be used in therapeutic and diagnostic applications. For example,
PNAs can be used as antisense or antigene agents for sequence-specific modulation of gene
expression by, e.g., inducing transcription or translation arrest or inhibiting replication. PNAs of
NOVX can also be used, for example, in the analysis of single base pair mutations in a gene (e.g.,
PNA directed PCR clamping); as artificial restriction enzymes when used in combination with
other enzymes, e.g., S1 nucleases (See, Hyrup, et al., 1996. supra); or as probes or primers for
DNA sequence and hybridization (See, Hyrup, et al., 1996. supra; Perry-O'Keefe, et al., 1996.
supra).

In another embodiment, PNAs of NOVX can be modified, e.g., to enhance their stability
or cellular uptake, by attaching lipophilic or other helper groups to PNA, by the formation of
PNA-DNA chimeras, or by the use of liposomes or other techniques of drug delivery known in
the art. For example, PNA-DNA chimeras of NOVX can be generated that may combine the
advantageous properties of PNA and DNA. Such chimeras allow DNA recognition enzymes
(e.g., RNase H and DNA polymerases) to interact with the DNA portion while the PNA portion
would provide high binding affinity and specificity. PNA-DNA chimeras can be linked using
linkers of appropriate lengths selected in terms of base stacking, number of bonds between the
nucleobases, and orientation (see, Hyrup, et al., 1996. supra). The synthesis of PNA-DNA
chimeras can be performed as described in Hyrup, et al., 1996. supra and Finn, et al., 1996. Nucl
Acids Res 24: 3357-3363. For example, a DNA chain can be synthesized on a solid support using
standard phosphoramidite coupling chemistry, and modified nucleoside analogs, e.g.,
5'-(4-methoxytrityl)amino-5'-deoxy-thymidine phosphoramidite, can be used between the PNA
and the 5' end of DNA. See, e.g., Mag, et al., 1989. Nucl Acid Res 17: 5973-5988. PNA
monomers are then coupled in a stepwise manner to produce a chimeric molecule with a 5' PNA
segment and a 3' DNA segment. See, e.g., Finn, et al., 1996. supra. Alternatively, chimeric
molecules can be synthesized with a 5' DNA segment and a 3' PNA segment. See, e.g., Petersen,
et al., 1975. Bioorg. Med. Chem. Lett. 5: 1119-11124.

In other embodiments, the oligonucleotide may include other appended groups such as 
peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating transport across the 
cell membrane (see, e.g., Letsinger, et al., 1989. Proc. Natl. Acad. Sci. U.S.A. 86: 6553-6556; 
Lemaitre, et al., 1987. Proc. Natl. Acad. Sci. 84: 648-652; PCT Publication No. WO88/09810) or 
the blood-brain barrier (see, e.g., PCT Publication No. WO 89/10134). In addition, 
oligonucleotides can be modified with hybridization triggered cleavage agents (see, e.g., Krol, et 
al., 1988. BioTechniques 6:958-976) or intercalating agents (see, e.g., Zon, 1988. Pharm. Res. 5: 
539-549). To this end, the oligonucleotide may be conjugated to another molecule, e.g., a 
peptide, a hybridization triggered cross-linking agent, a transport agent, a hybridization-triggered 
cleavage agent, and the like. 

NOVX Polypeptides 

A polypeptide according to the invention includes a polypeptide including the amino acid 
sequence of NOVX polypeptides whose sequences are provided in SEQ ID NOS:2, 4, 6, 8, 10, 12, 
14, 16, 18, 20, 22, 24, 26, 28 and 30. The invention also includes a mutant or variant protein any 
of whose residues may be changed from the corresponding residues shown in SEQ ID NOS:2, 4, 
6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28 and 30 while still encoding a protein that maintains its 
NOVX activities and physiological functions, or a functional fragment thereof. 

In general, an NOVX variant that preserves NOVX-like function includes any variant in 
which residues at a particular position in the sequence have been substituted by other amino acids, 
and further includes the possibility of inserting an additional residue or residues between two 
residues of the parent protein as well as the possibility of deleting one or more residues from the 
parent sequence. Any amino acid substitution, insertion, or deletion is encompassed by the 
invention. In favorable circumstances, the substitution is a conservative substitution as defined 
above. 
One aspect of the invention pertains to isolated NOVX proteins, and biologically-active 
portions thereof, or derivatives, fragments, analogs or homologs thereof. Also provided are 
polypeptide fragments suitable for use as immunogens to raise anti-NOVX antibodies. In one 
embodiment, native NOVX proteins can be isolated from cells or tissue sources by an appropriate 
purification scheme using standard protein purification techniques. In another embodiment, 
NOVX proteins are produced by recombinant DNA techniques. Alternative to recombinant 
expression, an NOVX protein or polypeptide can be synthesized chemically using standard 
peptide synthesis techniques. 

An "isolated" or "purified" polypeptide or protein or biologically-active portion thereof is 
substantially free of cellular material or other contaminating proteins from the cell or tissue source 
from which the NOVX protein is derived, or substantially free from chemical precursors or other 
chemicals when chemically synthesized. The language "substantially free of cellular material" 
includes preparations of NOVX proteins in which the protein is separated from cellular 
components of the cells from which it is isolated or recombinantly-produced. In one embodiment, 
the language "substantially free of cellular material" includes preparations of NOVX proteins 
having less than about 30% (by dry weight) of non-NOVX proteins (also referred to herein as a 
"contaminating protein"), more preferably less than about 20% of non-NOVX proteins, still more 
preferably less than about 10% of non-NOVX proteins, and most preferably less than about 5% of 
non-NOVX proteins. When the NOVX protein or biologically-active portion thereof is 
recombinantly-produced, it is also preferably substantially free of culture medium, i.e., culture 
medium represents less than about 20%, more preferably less than about 10%, and most 
preferably less than about 5% of the volume of the NOVX protein preparation. 
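
As an illustration of the dry-weight thresholds above, the following minimal sketch computes the percentage of contaminating protein in a preparation and reports the tightest threshold it satisfies. The function names are hypothetical, and treating "less than about" as a strict numerical cutoff is an assumption made only for this example.

```python
from typing import Optional

# Thresholds named in the text, tightest first
# (percent non-NOVX protein by dry weight).
THRESHOLDS = (5.0, 10.0, 20.0, 30.0)


def contaminant_percent(target_mg: float, contaminant_mg: float) -> float:
    """Percent of total dry weight contributed by contaminating protein."""
    total = target_mg + contaminant_mg
    if total <= 0:
        raise ValueError("total dry weight must be positive")
    return 100.0 * contaminant_mg / total


def tightest_threshold(percent: float) -> Optional[float]:
    """Return the smallest threshold the preparation falls under, or None.

    Assumption: 'less than about X%' is read as a strict '< X'.
    """
    for limit in THRESHOLDS:
        if percent < limit:
            return limit
    return None
```

For example, a preparation with 96 mg of target protein and 4 mg of other protein contains 4% contaminating protein and so falls under the 5% threshold.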

The language "substantially free of chemical precursors or other chemicals" includes 
preparations of NOVX proteins in which the protein is separated from chemical precursors or 
other chemicals that are involved in the synthesis of the protein. In one embodiment, the 
language "substantially free of chemical precursors or other chemicals" includes preparations of 
NOVX proteins having less than about 30% (by dry weight) of chemical precursors or 
non-NOVX chemicals, more preferably less than about 20% chemical precursors or non-NOVX 
chemicals, still more preferably less than about 10% chemical precursors or non-NOVX 
chemicals, and most preferably less than about 5% chemical precursors or non-NOVX chemicals. 
Biologically-active portions of NOVX proteins include peptides comprising amino acid 
sequences sufficiently homologous to or derived from the amino acid sequences of the NOVX 
proteins (e.g., the amino acid sequence shown in SEQ ID NOS:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 
22, 24, 26, 28 and 30) that include fewer amino acids than the full-length NOVX proteins, and 
exhibit at least one activity of an NOVX protein. Typically, biologically-active portions comprise 
a domain or motif with at least one activity of the NOVX protein. A biologically-active portion 
of an NOVX protein can be a polypeptide which is, for example, 10, 25, 50, 100 or more amino 
acid residues in length. 

Moreover, other biologically-active portions, in which other regions of the protein are 
deleted, can be prepared by recombinant techniques and evaluated for one or more of the 
functional activities of a native NOVX protein. 

In an embodiment, the NOVX protein has an amino acid sequence shown in SEQ ID NOS:2, 
4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28 and 30. In other embodiments, the NOVX protein is 
substantially homologous to SEQ ID NOS:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28 and 30, 
and retains the functional activity of the protein of SEQ ID NOS:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 
22, 24, 26, 28 and 30, yet differs in amino acid sequence due to natural allelic variation or 
mutagenesis, as described in detail, below. Accordingly, in another embodiment, the NOVX 
protein is a protein that comprises an amino acid sequence at least about 45% homologous to the 
amino acid sequence SEQ ID NOS:2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28 and 30, and 
retains the functional activity of the NOVX proteins of SEQ ID NOS:2, 4, 6, 8, 10, 12, 14, 16, 18, 
20, 22, 24, 26, 28 and 30. 

Determining Homology Between Two or More Sequences 

To determine the percent homology of two amino acid sequences or of two nucleic acids, 
the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in the 
sequence of a first amino acid or nucleic acid sequence for optimal alignment with a second 
amino or nucleic acid sequence). The amino acid residues or nucleotides at corresponding amino 
acid positions or nucleotide positions are then compared. When a position in the first sequence is 
occupied by the same amino acid residue or nucleotide as the corresponding position in the 
second sequence, then the molecules are homologous at that position (i.e., as used herein amino 
acid or nucleic acid "homology" is equivalent to amino acid or nucleic acid "identity"). 
The nucleic acid sequence homology may be determined as the degree of identity between 
two sequences. The homology may be determined using computer programs known in the art, 
such as GAP software provided in the GCG program package. See, Needleman and Wunsch, 
1970. J Mol Biol 48: 443-453. Using GCG GAP software with the following settings for nucleic 
acid sequence comparison: GAP creation penalty of 5.0 and GAP extension penalty of 0.3, the 
coding region of the analogous nucleic acid sequences referred to above exhibits a degree of 
identity preferably of at least 70%, 75%, 80%, 85%, 90%, 95%, 98%, or 99%, with the CDS 
(encoding) part of the DNA sequence shown in SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 
23, 25, 27 and 29. 
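
The kind of global alignment performed by such software can be sketched as follows. This is not the GCG GAP program itself but a minimal Needleman-Wunsch style alignment with separate gap creation and gap extension penalties. The match and mismatch scores (1.0 and 0.0), the convention that a gap of length k costs creation + k x extension, and the function name `global_align` are assumptions made for illustration only.

```python
from typing import List, Tuple

NEG_INF = float("-inf")


def _argmax(values: List[float]) -> int:
    """Index of the largest value (first one on ties)."""
    return max(range(len(values)), key=values.__getitem__)


def global_align(a: str, b: str, match: float = 1.0, mismatch: float = 0.0,
                 gap_create: float = 5.0,
                 gap_extend: float = 0.3) -> Tuple[str, str, float]:
    """Global alignment with affine gaps; returns (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)
    # State 0: a[i-1] paired with b[j-1]; state 1: a[i-1] opposite a gap;
    # state 2: b[j-1] opposite a gap.
    score = [[[NEG_INF] * (m + 1) for _ in range(n + 1)] for _ in range(3)]
    back = [[[0] * (m + 1) for _ in range(n + 1)] for _ in range(3)]
    score[0][0][0] = 0.0
    for i in range(1, n + 1):            # leading gap in b
        score[1][i][0] = -(gap_create + gap_extend * i)
        back[1][i][0] = 1
    for j in range(1, m + 1):            # leading gap in a
        score[2][0][j] = -(gap_create + gap_extend * j)
        back[2][0][j] = 2
    open_cost = gap_create + gap_extend  # cost of the first gap position
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            diag = [score[k][i - 1][j - 1] for k in range(3)]
            k = _argmax(diag)
            score[0][i][j], back[0][i][j] = diag[k] + s, k
            up = [score[0][i - 1][j] - open_cost,
                  score[1][i - 1][j] - gap_extend,
                  score[2][i - 1][j] - open_cost]
            k = _argmax(up)
            score[1][i][j], back[1][i][j] = up[k], k
            left = [score[0][i][j - 1] - open_cost,
                    score[1][i][j - 1] - open_cost,
                    score[2][i][j - 1] - gap_extend]
            k = _argmax(left)
            score[2][i][j], back[2][i][j] = left[k], k
    state = _argmax([score[k][n][m] for k in range(3)])
    best = score[state][n][m]
    out_a: List[str] = []
    out_b: List[str] = []
    i, j = n, m
    while i > 0 or j > 0:                # trace back from the end
        prev = back[state][i][j]
        if state == 0:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif state == 1:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
        state = prev
    return "".join(reversed(out_a)), "".join(reversed(out_b)), best
```

With these assumed scores, aligning the toy sequences `ACGTACGT` and `ACGTCGT` introduces a single one-position gap in the shorter sequence. Production tools use substitution matrices and tuned defaults, so scores from this sketch are not comparable to GCG output.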

The term "sequence identity" refers to the degree to which two polynucleotide or 
polypeptide sequences are identical on a residue-by-residue basis over a particular region of 
comparison. The term "percentage of sequence identity" is calculated by comparing two 
optimally aligned sequences over that region of comparison, determining the number of positions 
at which the identical nucleic acid base (e.g., A, T, C, G, U, or I, in the case of nucleic acids) 
occurs in both sequences to yield the number of matched positions, dividing the number of 
matched positions by the total number of positions in the region of comparison (i.e., the window 
size), and multiplying the result by 100 to yield the percentage of sequence identity. The term 
"substantial identity" as used herein denotes a characteristic of a polynucleotide sequence, 
wherein the polynucleotide comprises a sequence that has at least 80 percent sequence identity, 
preferably at least 85 percent identity and often 90 to 95 percent sequence identity, more usually 
at least 99 percent sequence identity as compared to a reference sequence over a comparison 
region. 
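
The calculation described above (matched positions divided by window size, times 100) can be written as a short sketch. It assumes the two sequences have already been optimally aligned and padded to equal length with a gap character; the function names and the choice of "-" as the gap symbol are illustrative assumptions.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of sequence identity over an aligned comparison window.

    The window size is the alignment length; positions where either
    sequence has a gap count toward the window but never as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    window = len(aligned_a)
    if window == 0:
        raise ValueError("comparison window is empty")
    matched = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != gap
    )
    return 100.0 * matched / window


def is_substantially_identical(percent: float, threshold: float = 80.0) -> bool:
    """True if identity meets the 80 percent floor named in the text."""
    return percent >= threshold
```

For instance, two aligned eight-position sequences that agree at seven positions have 87.5 percent sequence identity, which meets the 80 percent floor.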

Chimeric and Fusion Proteins 

The invention also provides NOVX chimeric or fusion proteins. As used herein, an 
NOVX "chimeric protein" or "fusion protein" comprises an NOVX polypeptide operatively- 
linked to a non-NOVX polypeptide. An "NOVX polypeptide" refers to a polypeptide having an 
amino acid sequence corresponding to an NOVX protein SEQ ID NOS:2, 4, 6, 8, 10, 12, 14, 16, 
18, 20, 22, 24, 26, 28 and 30, whereas a "non-NOVX polypeptide" refers to a polypeptide having 
an amino acid sequence corresponding to a protein that is not substantially homologous to the 
NOVX protein, e.g., a protein that is different from the NOVX protein and that is derived from 
the same or a different organism. Within an NOVX fusion protein the NOVX polypeptide can 
correspond to all or a portion of an NOVX protein. In one embodiment, an NOVX fusion protein 
comprises at least one biologically-active portion of an NOVX protein. In another embodiment, 
an NOVX fusion protein comprises at least two biologically-active portions of an NOVX protein. 
In yet another embodiment, an NOVX fusion protein comprises at least three biologically-active 
portions of an NOVX protein. Within the fusion protein, the term "operatively-linked" is 
intended to indicate that the NOVX polypeptide and the non-NOVX polypeptide are fused 
in-frame with one another. The non-NOVX polypeptide can be fused to the N-terminus or 
C-terminus of the NOVX polypeptide. 

In one embodiment, the fusion protein is a GST-NOVX fusion protein in which the 
NOVX sequences are fused to the C-terminus of the GST (glutathione S-transferase) sequences. 
Such fusion proteins can facilitate the purification of recombinant NOVX polypeptides. 

In another embodiment, the fusion protein is an NOVX protein containing a heterologous 
signal sequence at its N-terminus. In certain host cells (e.g., mammalian host cells), expression 
and/or secretion of NOVX can be increased through use of a heterologous signal sequence. 

In yet another embodiment, the fusion protein is an NOVX-immunoglobulin fusion 
protein in which the NOVX sequences are fused to sequences derived from a member of the 
immunoglobulin protein family. The NOVX-immunoglobulin fusion proteins of the invention 
can be incorporated into pharmaceutical compositions and administered to a subject to inhibit an 
interaction between an NOVX ligand and an NOVX protein on the surface of a cell, to thereby 
suppress NOVX-mediated signal transduction in vivo. The NOVX-immunoglobulin fusion 
proteins can be used to affect the bioavailability of an NOVX cognate ligand. Inhibition of the 
NOVX ligand/NOVX interaction may be useful therapeutically for both the treatment of 
proliferative and differentiative disorders, as well as modulating (e.g., promoting or inhibiting) cell 
survival. Moreover, the NOVX-immunoglobulin fusion proteins of the invention can be used as 
immunogens to produce anti-NOVX antibodies in a subject, to purify NOVX ligands, and in 
screening assays to identify molecules that inhibit the interaction of NOVX with an NOVX 
ligand. 

An NOVX chimeric or fusion protein of the invention can be produced by standard 
recombinant DNA techniques. For example, DNA fragments coding for the different polypeptide 
sequences are ligated together in-frame in accordance with conventional techniques, e.g., by 
employing blunt-ended or stagger-ended termini for ligation, restriction enzyme digestion to 
provide for appropriate termini, filling-in of cohesive ends as appropriate, alkaline phosphatase 
treatment to avoid undesirable joining, and enzymatic ligation. In another embodiment, the fusion 
gene can be synthesized by conventional techniques including automated DNA synthesizers. 
Alternatively, PCR amplification of gene fragments can be carried out using anchor primers that 
give rise to complementary overhangs between two consecutive gene fragments that can 
subsequently be annealed and reamplified to generate a chimeric gene sequence (see, e.g., 
Ausubel, et al. (eds.) Current Protocols in Molecular Biology, John Wiley & Sons, 1992). 
Moreover, many expression vectors are commercially available that already encode a fusion 
moiety (e.g., a GST polypeptide). An NOVX-encoding nucleic acid can be cloned into such an 
expression vector such that the fusion moiety is linked in-frame to the NOVX protein. 
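
The in-frame requirement can be checked computationally before a construct is built. The sketch below joins an upstream coding sequence (for example, a fusion moiety) to a downstream coding sequence through an optional linker and verifies that the reading frame is preserved. The function names and the toy sequences in the usage note are hypothetical and are not taken from the sequence listing.

```python
from typing import List

STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq: str) -> List[str]:
    """Split a sequence into consecutive three-base codons."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def fuse_in_frame(upstream: str, downstream: str, linker: str = "") -> str:
    """Join two coding sequences so the downstream part stays in frame.

    The terminal stop codon of the upstream sequence, if present, is
    removed. A ValueError is raised if any part would shift the frame or
    if the fused sequence contains a premature stop codon.
    """
    up, link, down = upstream.upper(), linker.upper(), downstream.upper()
    for name, part in (("upstream", up), ("linker", link), ("downstream", down)):
        if len(part) % 3 != 0:
            raise ValueError(f"{name} length is not a multiple of three")
    if up and codons(up)[-1] in STOP_CODONS:
        up = up[:-3]  # read through into the downstream partner
    fused = up + link + down
    for index, codon in enumerate(codons(fused)[:-1]):
        if codon in STOP_CODONS:
            raise ValueError(f"premature stop codon at codon {index}")
    return fused
```

With the toy inputs `ATGGCCTAA` and `ATGAAATGA`, the upstream stop codon is dropped and the result is `ATGGCCATGAAATGA`; a two-base linker such as `GG` is rejected because it would shift the frame.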

NOVX Agonists and Antagonists 

The invention also pertains to variants of the NOVX proteins that function as either 
NOVX agonists (i.e., mimetics) or as NOVX antagonists. Variants of the NOVX protein can be 
generated by mutagenesis (e.g., discrete point mutation or truncation of the NOVX protein). An 
agonist of the NOVX protein can retain substantially the same, or a subset of, the biological 
activities of the naturally occurring form of the NOVX protein. An antagonist of the NOVX 
protein can inhibit one or more of the activities of the naturally occurring form of the NOVX 
protein by, for example, competitively binding to a downstream or upstream member of a cellular 
signaling cascade which includes the NOVX protein. Thus, specific biological effects can be 
elicited by treatment with a variant of limited function. In one embodiment, treatment of a subject 
with a variant having a subset of the biological activities of the naturally occurring form of the 
protein has fewer side effects in a subject relative to treatment with the naturally occurring form 
of the NOVX proteins. 

Variants of the NOVX proteins that function as either NOVX agonists (i.e., mimetics) or 
as NOVX antagonists can be identified by screening combinatorial libraries of mutants (e.g., 
truncation mutants) of the NOVX proteins for NOVX protein agonist or antagonist activity. In 
one embodiment, a variegated library of NOVX variants is generated by combinatorial 
mutagenesis at the nucleic acid level and is encoded by a variegated gene library. A variegated 
library of NOVX variants can be produced by, for example, enzymatically ligating a mixture of 
synthetic oligonucleotides into gene sequences such that a degenerate set of potential NOVX 
sequences is expressible as individual polypeptides, or alternatively, as a set of larger fusion 
proteins (e.g., for phage display) containing the set of NOVX sequences therein. There are a 
variety of methods which can be used to produce libraries of potential NOVX variants from a 
degenerate oligonucleotide sequence. Chemical synthesis of a degenerate gene sequence can be 
performed in an automatic DNA synthesizer, and the synthetic gene then ligated into an 
appropriate expression vector. Use of a degenerate set of genes allows for the provision, in one 
mixture, of all of the sequences encoding the desired set of potential NOVX sequences. Methods 
for synthesizing degenerate oligonucleotides are well-known within the art. See, e.g., Narang, 
1983. Tetrahedron 39: 3; Itakura, et al., 1984. Annu. Rev. Biochem. 53: 323; Itakura, et al., 1984. 
Science 198: 1056; Ike, et al., 1983. Nucl. Acids Res. 11: 477. 
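
To make concrete how a single degenerate oligonucleotide encodes a set of sequences, the following sketch expands a sequence written in the standard IUPAC nucleotide ambiguity codes into every specific sequence it represents. The function names and the example oligonucleotides are illustrative assumptions.

```python
from itertools import product
from typing import List

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(oligo: str) -> int:
    """Number of distinct sequences encoded by a degenerate oligonucleotide."""
    count = 1
    for base in oligo.upper():
        count *= len(IUPAC[base])  # KeyError signals an unknown symbol
    return count


def expand_degenerate(oligo: str) -> List[str]:
    """List every specific sequence represented by the degenerate oligo."""
    pools = [IUPAC[base] for base in oligo.upper()]
    return ["".join(bases) for bases in product(*pools)]
```

For example, the degenerate codon `NNK` represents 4 x 4 x 2 = 32 specific codons. Because the count grows multiplicatively, `degeneracy` is worth checking before calling `expand_degenerate` on a long oligonucleotide.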

Polypeptide Libraries 

In addition, libraries of fragments of the NOVX protein coding sequences can be used to 
generate a variegated population of NOVX fragments for screening and subsequent selection of 
variants of an NOVX protein. In one embodiment, a library of coding sequence fragments can be 
generated by treating a double stranded PCR fragment of an NOVX coding sequence with a 
nuclease under conditions wherein nicking occurs only about once per molecule, denaturing the 
double stranded DNA, renaturing the DNA to form double-stranded DNA that can include 
sense/antisense pairs from different nicked products, removing single stranded portions from 
reformed duplexes by treatment with S1 nuclease, and ligating the resulting fragment library into 
an expression vector. By this method, expression libraries can be derived which encode 
N-terminal and internal fragments of various sizes of the NOVX proteins. 

Various techniques are known in the art for screening gene products of combinatorial 
libraries made by point mutations or truncation, and for screening cDNA libraries for gene 
products having a selected property. Such techniques are adaptable for rapid screening of the 
gene libraries generated by the combinatorial mutagenesis of NOVX proteins. The most widely 
used techniques, which are amenable to high throughput analysis, for screening large gene 
libraries typically include cloning the gene library into replicable expression vectors, transforming 
appropriate cells with the resulting library of vectors, and expressing the combinatorial genes 
under conditions in which detection of a desired activity facilitates isolation of the vector 
encoding the gene whose product was detected. Recursive ensemble mutagenesis (REM), a new 
technique that enhances the frequency of functional mutants in the libraries, can be used in 
combination with the screening assays to identify NOVX variants. See, e.g., Arkin and Youvan, 
1992. Proc. Natl. Acad. Sci. USA 89: 7811-7815; Delagrave, et al., 1993. Protein Engineering 
6:327-331. 

Anti-NOVX Antibodies 

Also included in the invention are antibodies to NOVX proteins, or fragments of NOVX 
proteins. The term "antibody" as used herein refers to immunoglobulin molecules and 
immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules that contain 
an antigen binding site that specifically binds (immunoreacts with) an antigen. Such antibodies 
include, but are not limited to, polyclonal, monoclonal, chimeric, single chain, Fab, Fab' and F(ab')2 
fragments, and an Fab expression library. In general, an antibody molecule obtained from humans 
relates to any of the classes IgG, IgM, IgA, IgE and IgD, which differ from one another by the 
nature of the heavy chain present in the molecule. Certain classes have subclasses as well, such as 
IgG1, IgG2, and others. Furthermore, in humans, the light chain may be a kappa chain or a lambda 
chain. Reference herein to antibodies includes a reference to all such classes, subclasses and 
types of human antibody species. 

An isolated NOVX-related protein of the invention may be intended to serve as an antigen, 
or a portion or fragment thereof, and additionally can be used as an immunogen to generate 
antibodies that immunospecifically bind the antigen, using standard techniques for polyclonal and 
monoclonal antibody preparation. The full-length protein can be used or, alternatively, the 
invention provides antigenic peptide fragments of the antigen for use as immunogens. An 
antigenic peptide fragment comprises at least 6 amino acid residues of the amino acid sequence of 
the full length protein and encompasses an epitope thereof such that an antibody raised against the 
peptide forms a specific immune complex with the full length protein or with any fragment that 
contains the epitope. Preferably, the antigenic peptide comprises at least 10 amino acid residues, 
or at least 15 amino acid residues, or at least 20 amino acid residues, or at least 30 amino acid 
residues. Preferred epitopes encompassed by the antigenic peptide are regions of the protein that 
are located on its surface; commonly these are hydrophilic regions. 

In certain embodiments of the invention, at least one epitope encompassed by the 
antigenic peptide is a region of NOVX-related protein that is located on the surface of the protein, 
e.g., a hydrophilic region. A hydrophobicity analysis of the human NOVX-related protein 
sequence will indicate which regions of a NOVX-related protein are particularly hydrophilic and, 
therefore, are likely to encode surface residues useful for targeting antibody production. As a 
means for targeting antibody production, hydropathy plots showing regions of hydrophilicity and 
hydrophobicity may be generated by any method well known in the art, including, for example, 
the Kyte Doolittle or the Hopp Woods methods, either with or without Fourier transformation. 
See, e.g., Hopp and Woods, 1981, Proc. Natl. Acad. Sci. USA 78: 3824-3828; Kyte and Doolittle 
1982, J. Mol. Biol. 157: 105-142, each of which is incorporated herein by reference in its entirety. 
Antibodies that are specific for one or more domains within an antigenic protein, or derivatives, 
fragments, analogs or homologs thereof, are also provided herein. 
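
A hydropathy profile of the kind referred to above can be sketched as a sliding-window average over the Kyte-Doolittle hydropathy scale, in which lower values indicate more hydrophilic stretches. The window length of 9 residues, the function names, and the toy peptide in the usage note are assumptions for illustration; no Fourier transformation is applied.

```python
from typing import List, Tuple

# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 9) -> List[float]:
    """Mean Kyte-Doolittle score for each window along the sequence."""
    seq = protein.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[residue] for residue in seq]
    return [
        sum(scores[start:start + window]) / window
        for start in range(len(seq) - window + 1)
    ]


def most_hydrophilic_window(protein: str, window: int = 9) -> Tuple[int, float]:
    """Zero-based start index and mean score of the most hydrophilic window."""
    profile = hydropathy_profile(protein, window)
    start = min(range(len(profile)), key=profile.__getitem__)
    return start, profile[start]
```

For the toy peptide `KKKKKIIIII` with a window of 5, the most hydrophilic window starts at position 0 (the lysine run). A low hydropathy score is only a heuristic indicator of surface exposure, so candidate regions found this way would still need experimental confirmation.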

A protein of the invention, or a derivative, fragment, analog, homolog or ortholog thereof, 
may be utilized as an immunogen in the generation of antibodies that immunospecifically bind 
these protein components. 

Various procedures known within the art may be used for the production of polyclonal or 
monoclonal antibodies directed against a protein of the invention, or against derivatives, 
fragments, analogs, homologs or orthologs thereof (see, for example, Antibodies: A Laboratory 
Manual, Harlow and Lane, 1988, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 
incorporated herein by reference). Some of these antibodies are discussed below. 

Polyclonal Antibodies 

For the production of polyclonal antibodies, various suitable host animals (e.g., rabbit, 
goat, mouse or other mammal) may be immunized by one or more injections with the native 
protein, a synthetic variant thereof, or a derivative of the foregoing. An appropriate immunogenic 
preparation can contain, for example, the naturally occurring immunogenic protein, a chemically 
synthesized polypeptide representing the immunogenic protein, or a recombinantly expressed 
immunogenic protein. Furthermore, the protein may be conjugated to a second protein known to 
be immunogenic in the mammal being immunized. Examples of such immunogenic proteins 
include but are not limited to keyhole limpet hemocyanin, serum albumin, bovine thyroglobulin, 
and soybean trypsin inhibitor. The preparation can further include an adjuvant. Various adjuvants 
used to increase the immunological response include, but are not limited to, Freund's (complete 
and incomplete), mineral gels (e.g., aluminum hydroxide), surface active substances (e.g., 
lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, dinitrophenol, etc.), adjuvants 
usable in humans such as Bacille Calmette-Guerin and Corynebacterium parvum, or similar 
immunostimulatory agents. Additional examples of adjuvants which can be employed include 
MPL-TDM adjuvant (monophosphoryl Lipid A, synthetic trehalose dicorynomycolate). 
The polyclonal antibody molecules directed against the immunogenic protein can be 
isolated from the mammal (e.g., from the blood) and further purified by well known techniques, 
such as affinity chromatography using protein A or protein G, which provide primarily the IgG 
fraction of immune serum. Subsequently, or alternatively, the specific antigen which is the target 
of the immunoglobulin sought, or an epitope thereof, may be immobilized on a column to purify 
the immune specific antibody by immunoaffinity chromatography. Purification of 
immunoglobulins is discussed, for example, by D. Wilkinson (The Scientist, published by The 
Scientist, Inc., Philadelphia PA, Vol. 14, No. 8 (April 17, 2000), pp. 25-28). 

Monoclonal Antibodies 

The term "monoclonal antibody" (MAb) or "monoclonal antibody composition", as used 
herein, refers to a population of antibody molecules that contain only one molecular species of 
antibody molecule consisting of a unique light chain gene product and a unique heavy chain gene 
product. In particular, the complementarity determining regions (CDRs) of the monoclonal 
antibody are identical in all the molecules of the population. MAbs thus contain an antigen 
binding site capable of immunoreacting with a particular epitope of the antigen characterized by a 
unique binding affinity for it. 

Monoclonal antibodies can be prepared using hybridoma methods, such as those described 
by Kohler and Milstein, Nature, 256:495 (1975). In a hybridoma method, a mouse, hamster, or 
other appropriate host animal, is typically immunized with an immunizing agent to elicit 
lymphocytes that produce or are capable of producing antibodies that will specifically bind to the 
immunizing agent. Alternatively, the lymphocytes can be immunized in vitro. 

The immunizing agent will typically include the protein antigen, a fragment thereof or a 
fusion protein thereof. Generally, either peripheral blood lymphocytes are used if cells of human 
origin are desired, or spleen cells or lymph node cells are used if non-human mammalian sources 
are desired. The lymphocytes are then fused with an immortalized cell line using a suitable fusing 
agent, such as polyethylene glycol, to form a hybridoma cell (Goding, MONOCLONAL 
ANTIBODIES: Principles and Practice, Academic Press, (1986) pp. 59-103). Immortalized cell 
lines are usually transformed mammalian cells, particularly myeloma cells of rodent, bovine and 
human origin. Usually, rat or mouse myeloma cell lines are employed. The hybridoma cells can 
be cultured in a suitable culture medium that preferably contains one or more substances that 
inhibit the growth or survival of the unfused, immortalized cells. For example, if the parental 
cells lack the enzyme hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT), the 
culture medium for the hybridomas typically will include hypoxanthine, aminopterin, and 
thymidine ("HAT medium"), which substances prevent the growth of HGPRT-deficient cells. 
Preferred immortalized cell lines are those that fuse efficiently, support stable high level 
expression of antibody by the selected antibody-producing cells, and are sensitive to a medium 
such as HAT medium. More preferred immortalized cell lines are murine myeloma lines, which 
can be obtained, for instance, from the Salk Institute Cell Distribution Center, San Diego, 
California and the American Type Culture Collection, Manassas, Virginia. Human myeloma and 
mouse-human heteromyeloma cell lines also have been described for the production of human 
monoclonal antibodies (Kozbor, J. Immunol., 133:3001 (1984); Brodeur et al., Monoclonal 
Antibody Production Techniques and Applications, Marcel Dekker, Inc., New York, (1987) 
pp. 51-63). 

The culture medium in which the hybridoma cells are cultured can then be assayed for the 
presence of monoclonal antibodies directed against the antigen. Preferably, the binding 
specificity of monoclonal antibodies produced by the hybridoma cells is determined by 
immunoprecipitation or by an in vitro binding assay, such as radioimmunoassay (RIA) or 
enzyme-linked immunosorbent assay (ELISA). Such techniques and assays are known in the 
art. The binding affinity of the monoclonal antibody can, for example, be determined by the 
Scatchard analysis of Munson and Pollard, Anal. Biochem., 107:220 (1980). Preferably, 
antibodies having a high degree of specificity and a high binding affinity for the target antigen are 
isolated. 
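
The idea behind a Scatchard analysis can be illustrated with a minimal sketch. For a single class of binding sites, plotting bound/free against bound gives a straight line with slope -1/Kd and x-intercept Bmax. The code below fits that line by ordinary least squares. It is a classical linear simplification offered for illustration, not the procedure of the cited reference, and the function name and the synthetic data in the usage note are assumptions.

```python
from typing import Sequence, Tuple


def scatchard_fit(bound: Sequence[float],
                  free: Sequence[float]) -> Tuple[float, float]:
    """Estimate (Kd, Bmax) from bound and free ligand concentrations.

    Fits bound/free = (Bmax - bound) / Kd by least squares, assuming a
    single class of non-interacting binding sites.
    """
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two paired measurements")
    x = list(bound)
    y = [b / f for b, f in zip(bound, free)]
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    if sxx == 0:
        raise ValueError("bound values must not all be identical")
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("slope is not negative; data do not fit the model")
    intercept = y_mean - slope * x_mean
    kd = -1.0 / slope           # dissociation constant
    bmax = -intercept / slope   # x-intercept: total binding sites
    return kd, bmax
```

With synthetic data generated from Kd = 2 and Bmax = 10 (in arbitrary concentration units), the fit recovers those two values. A smaller Kd corresponds to a higher binding affinity.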

After the desired hybridoma cells are identified, the clones can be subcloned by limiting 
dilution procedures and grown by standard methods. Suitable culture media for this purpose 
include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640 medium. 
Alternatively, the hybridoma cells can be grown in vivo as ascites in a mammal. 

The monoclonal antibodies secreted by the subclones can be isolated or purified from the 
culture medium or ascites fluid by conventional immunoglobulin purification procedures such as, 
for example, protein A-Sepharose, hydroxylapatite chromatography, gel electrophoresis, dialysis, 
or affinity chromatography. 

The monoclonal antibodies can also be made by recombinant DNA methods, such as those 
described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal antibodies of the 
invention can be readily isolated and sequenced using conventional procedures (e.g., by using 
oligonucleotide probes that are capable of binding specifically to genes encoding the heavy and 
light chains of murine antibodies). The hybridoma cells of the invention serve as a preferred 
source of such DNA. Once isolated, the DNA can be placed into expression vectors, which are 
then transfected into host cells such as simian COS cells, Chinese hamster ovary (CHO) cells, or 
myeloma cells that do not otherwise produce immunoglobulin protein, to obtain the synthesis of 
monoclonal antibodies in the recombinant host cells. The DNA also can be modified, for 
example, by substituting the coding sequence for human heavy and light chain constant domains 
in place of the homologous murine sequences (U.S. Patent No. 4,816,567; Morrison, Nature 368, 
812-13 (1994)) or by covalently joining to the immunoglobulin coding sequence all or part of the 
coding sequence for a non-immunoglobulin polypeptide. Such a non-immunoglobulin 
polypeptide can be substituted for the constant domains of an antibody of the invention, or can be 
substituted for the variable domains of one antigen-combining site of an antibody of the invention 
to create a chimeric bivalent antibody. 

Humanized Antibodies 

The antibodies directed against the protein antigens of the invention can further comprise 
humanized antibodies or human antibodies. These antibodies are suitable for administration to 
humans without engendering an immune response by the human against the administered 
immunoglobulin. Humanized forms of antibodies are chimeric immunoglobulins, 
immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab', F(ab')2 or other antigen- 
binding subsequences of antibodies) that are principally comprised of the sequence of a human 
immunoglobulin, and contain minimal sequence derived from a non-human immunoglobulin. 
Humanization can be performed following the method of Winter and co-workers (Jones et al., 
Nature, 321:522-525 (1986); Riechmann et al., Nature, 332:323-327 (1988); Verhoeyen et al., 
Science, 239:1534-1536 (1988)), by substituting rodent CDRs or CDR sequences for the 
corresponding sequences of a human antibody. (See also U.S. Patent No. 5,225,539.) In some 
instances, Fv framework residues of the human immunoglobulin are replaced by corresponding 
non-human residues. Humanized antibodies can also comprise residues which are found neither 
in the recipient antibody nor in the imported CDR or framework sequences. In general, the 
humanized antibody will comprise substantially all of at least one, and typically two, variable 
domains, in which all or substantially all of the CDR regions correspond to those of a non-human 
immunoglobulin and all or substantially all of the framework regions are those of a human 
immunoglobulin consensus sequence. The humanized antibody optimally also will comprise at 
least a portion of an immunoglobulin constant region (Fc), typically that of a human 
immunoglobulin (Jones et al., 1986; Riechmann et al., 1988; and Presta, Curr. Op. Struct. Biol., 
2:593-596 (1992)). 

Human Antibodies 

Fully human antibodies relate to antibody molecules in which essentially the entire 
sequences of both the light chain and the heavy chain, including the CDRs, arise from human 
genes. Such antibodies are termed "human antibodies", or "fully human antibodies" herein.
Human monoclonal antibodies can be prepared by the trioma technique; the human B-cell
hybridoma technique (see Kozbor, et al., 1983 Immunol Today 4: 72) and the EBV hybridoma
technique to produce human monoclonal antibodies (see Cole, et al., 1985 In: Monoclonal
Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). Human monoclonal antibodies
may be utilized in the practice of the present invention and may be produced by using human
hybridomas (see Cote, et al., 1983. Proc Natl Acad Sci USA 80: 2026-2030) or by transforming
human B-cells with Epstein Barr Virus in vitro (see Cole, et al., 1985 In: Monoclonal
Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96).

In addition, human antibodies can also be produced using additional techniques, including 
phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 227:381 (1991); Marks et al., J.
Mol. Biol., 222:581 (1991)). Similarly, human antibodies can be made by introducing human
immunoglobulin loci into transgenic animals, e.g., mice in which the endogenous
immunoglobulin genes have been partially or completely inactivated. Upon challenge, human
antibody production is observed, which closely resembles that seen in humans in all respects,
including gene rearrangement, assembly, and antibody repertoire. This approach is described, for
example, in U.S. Patent Nos. 5,545,807; 5,545,806; 5,569,825; 5,625,126; 5,633,425; 5,661,016,
and in Marks et al. (Bio/Technology 10, 779-783 (1992)); Lonberg et al. (Nature 368, 856-859
(1994)); Morrison (Nature 368, 812-13 (1994)); Fishwild et al. (Nature Biotechnology 14,
845-51 (1996)); Neuberger (Nature Biotechnology 14, 826 (1996)); and Lonberg and Huszar (Intern.
Rev. Immunol. 13, 65-93 (1995)).

Human antibodies may additionally be produced using transgenic nonhuman animals
which are modified so as to produce fully human antibodies rather than the animal's endogenous
antibodies in response to challenge by an antigen. (See PCT publication WO94/02602). The
endogenous genes encoding the heavy and light immunoglobulin chains in the nonhuman host
have been incapacitated, and active loci encoding human heavy and light chain immunoglobulins
are inserted into the host's genome. The human genes are incorporated, for example, using yeast
artificial chromosomes containing the requisite human DNA segments. An animal which
provides all the desired modifications is then obtained as progeny by crossbreeding intermediate
transgenic animals containing fewer than the full complement of the modifications. The preferred
embodiment of such a nonhuman animal is a mouse, and is termed the Xenomouse™ as disclosed
in PCT publications WO 96/33735 and WO 96/34096. This animal produces B cells which
secrete fully human immunoglobulins. The antibodies can be obtained directly from the animal
after immunization with an immunogen of interest, as, for example, a preparation of a polyclonal
antibody, or alternatively from immortalized B cells derived from the animal, such as hybridomas
producing monoclonal antibodies. Additionally, the genes encoding the immunoglobulins with
human variable regions can be recovered and expressed to obtain the antibodies directly, or can be
further modified to obtain analogs of antibodies such as, for example, single chain Fv molecules.

An example of a method of producing a nonhuman host, exemplified as a mouse, lacking
expression of an endogenous immunoglobulin heavy chain is disclosed in U.S. Patent No.
5,939,598. It can be obtained by a method including deleting the J segment genes from at least
one endogenous heavy chain locus in an embryonic stem cell to prevent rearrangement of the
locus and to prevent formation of a transcript of a rearranged immunoglobulin heavy chain locus,
the deletion being effected by a targeting vector containing a gene encoding a selectable marker;
and producing from the embryonic stem cell a transgenic mouse whose somatic and germ cells
contain the gene encoding the selectable marker.

A method for producing an antibody of interest, such as a human antibody, is disclosed in
U.S. Patent No. 5,916,771. It includes introducing an expression vector that contains a nucleotide
sequence encoding a heavy chain into one mammalian host cell in culture, introducing an 
expression vector containing a nucleotide sequence encoding a light chain into another 
mammalian host cell, and fusing the two cells to form a hybrid cell. The hybrid cell expresses an 
antibody containing the heavy chain and the light chain. 

In a further improvement on this procedure, a method for identifying a clinically relevant
epitope on an immunogen, and a correlative method for selecting an antibody that binds
immunospecifically to the relevant epitope with high affinity, are disclosed in PCT publication
WO 99/53049.

Fab Fragments and Single Chain Antibodies

According to the invention, techniques can be adapted for the production of single-chain
antibodies specific to an antigenic protein of the invention (see e.g., U.S. Patent No. 4,946,778).
In addition, methods can be adapted for the construction of Fab expression libraries (see e.g., Huse,
et al., 1989 Science 246: 1275-1281) to allow rapid and effective identification of monoclonal Fab
fragments with the desired specificity for a protein or derivatives, fragments, analogs or homologs
thereof. Antibody fragments that contain the idiotypes to a protein antigen may be produced by
techniques known in the art including, but not limited to: (i) an F(ab')2 fragment produced by
pepsin digestion of an antibody molecule; (ii) an Fab fragment generated by reducing the disulfide
bridges of an F(ab')2 fragment; (iii) an Fab fragment generated by the treatment of the antibody
molecule with papain and a reducing agent; and (iv) Fv fragments.

Bispecific Antibodies 

Bispecific antibodies are monoclonal, preferably human or humanized, antibodies that 
have binding specificities for at least two different antigens. In the present case, one of the 
binding specificities is for an antigenic protein of the invention. The second binding target is any 
other antigen, and advantageously is a cell-surface protein or receptor or receptor subunit. 

Methods for making bispecific antibodies are known in the art. Traditionally, the 
recombinant production of bispecific antibodies is based on the co-expression of two 
immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have different 
specificities (Milstein and Cuello, Nature, 305:537-539 (1983)). Because of the random 
assortment of immunoglobulin heavy and light chains, these hybridomas (quadromas) produce a 
potential mixture of ten different antibody molecules, of which only one has the correct bispecific
structure. The purification of the correct molecule is usually accomplished by affinity
chromatography steps. Similar procedures are disclosed in WO 93/08829, published 13 May
1993, and in Traunecker et al., 1991 EMBO J., 10:3655-3659.
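
The count of ten molecules can be checked by enumeration. The sketch below is illustrative only and is not part of any cited procedure; it assumes that each antibody consists of two interchangeable arms, that each arm is one heavy chain paired with one light chain, and that chain pairing is unrestricted. The labels H1, L1, H2 and L2 are hypothetical names for the chains of the two parental antibodies.

```python
from itertools import combinations_with_replacement, product

# Hypothetical labels: H1/L1 come from the first parental antibody,
# H2/L2 from the second.
HEAVY = ("H1", "H2")
LIGHT = ("L1", "L2")


def quadroma_species():
    """Enumerate distinct antibody molecules formed by random chain assortment.

    An arm is a (heavy, light) pair. The two arms of an antibody are
    interchangeable, so each molecule is an unordered pair of arms.
    """
    arms = list(product(HEAVY, LIGHT))  # 4 possible heavy/light arms
    return list(combinations_with_replacement(arms, 2))


def is_correct_bispecific(antibody):
    """True only when the molecule carries one cognate arm of each parent."""
    return set(antibody) == {("H1", "L1"), ("H2", "L2")}


species = quadroma_species()
correct = [ab for ab in species if is_correct_bispecific(ab)]
print(len(species), len(correct))
```

Under these assumptions the enumeration gives ten species, exactly one of which has the desired bispecific structure, which is why the affinity purification steps are needed.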

Antibody variable domains with the desired binding specificities (antibody-antigen 
combining sites) can be fused to immunoglobulin constant domain sequences. The fusion 
preferably is with an immunoglobulin heavy-chain constant domain, comprising at least part of
the hinge, CH2, and CH3 regions. It is preferred to have the first heavy-chain constant region
(CH1) containing the site necessary for light-chain binding present in at least one of the fusions.
DNAs encoding the immunoglobulin heavy-chain fusions and, if desired, the immunoglobulin
light chain, are inserted into separate expression vectors, and are co-transfected into a suitable
host organism. For further details of generating bispecific antibodies see, for example, Suresh et
al., Methods in Enzymology, 121:210 (1986).

According to another approach described in WO 96/27011, the interface between a pair of
antibody molecules can be engineered to maximize the percentage of heterodimers which are 
recovered from recombinant cell culture. The preferred interface comprises at least a part of the 
CH3 region of an antibody constant domain. In this method, one or more small amino acid side 
chains from the interface of the first antibody molecule are replaced with larger side chains (e.g. 
tyrosine or tryptophan). Compensatory "cavities" of identical or similar size to the large side 
chain(s) are created on the interface of the second antibody molecule by replacing large amino 
acid side chains with smaller ones (e.g. alanine or threonine). This provides a mechanism for 
increasing the yield of the heterodimer over other unwanted end-products such as homodimers. 

Bispecific antibodies can be prepared as full length antibodies or antibody fragments (e.g. 
F(ab')2 bispecific antibodies). Techniques for generating bispecific antibodies from antibody 
fragments have been described in the literature. For example, bispecific antibodies can be 
prepared using chemical linkage. Brennan et al., Science 229:81 (1985) describe a procedure 
wherein intact antibodies are proteolytically cleaved to generate F(ab')2 fragments. These
fragments are reduced in the presence of the dithiol complexing agent sodium arsenite to stabilize
vicinal dithiols and prevent intermolecular disulfide formation. The Fab' fragments generated are
then converted to thionitrobenzoate (TNB) derivatives. One of the Fab'-TNB derivatives is then
reconverted to the Fab'-thiol by reduction with mercaptoethylamine and is mixed with an
equimolar amount of the other Fab'-TNB derivative to form the bispecific antibody. The 
bispecific antibodies produced can be used as agents for the selective immobilization of enzymes. 

Additionally, Fab' fragments can be directly recovered from E. coli and chemically 
coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med. 175:217-225 (1992) describe 
the production of a fully humanized bispecific antibody F(ab')2 molecule. Each Fab' fragment
was separately secreted from E. coli and subjected to directed chemical coupling in vitro to form
the bispecific antibody. The bispecific antibody thus formed was able to bind to cells
overexpressing the ErbB2 receptor and normal human T cells, as well as trigger the lytic activity
of human cytotoxic lymphocytes against human breast tumor targets.

Various techniques for making and isolating bispecific antibody fragments directly from 
recombinant cell culture have also been described. For example, bispecific antibodies have been 
produced using leucine zippers. Kostelny et al., J. Immunol. 148(5):1547-1553 (1992). The
leucine zipper peptides from the Fos and Jun proteins were linked to the Fab' portions of two
different antibodies by gene fusion. The antibody homodimers were reduced at the hinge region
to form monomers and then re-oxidized to form the antibody heterodimers. This method can also
be utilized for the production of antibody homodimers. The "diabody" technology described by
Hollinger et al., Proc. Natl. Acad. Sci. USA 90:6444-6448 (1993) has provided an alternative
mechanism for making bispecific antibody fragments. The fragments comprise a heavy-chain
variable domain (VH) connected to a light-chain variable domain (VL) by a linker which is too
short to allow pairing between the two domains on the same chain. Accordingly, the VH and VL
domains of one fragment are forced to pair with the complementary VL and VH domains of
another fragment, thereby forming two antigen-binding sites. Another strategy for making
bispecific antibody fragments by the use of single-chain Fv (sFv) dimers has also been reported.
See, Gruber et al., J. Immunol. 152:5368 (1994).

Antibodies with more than two valencies are contemplated. For example, trispecific
antibodies can be prepared. Tutt et al., J. Immunol. 147:60 (1991).

Exemplary bispecific antibodies can bind to two different epitopes, at least one of which
originates in the protein antigen of the invention. Alternatively, an anti-antigenic arm of an
immunoglobulin molecule can be combined with an arm which binds to a triggering molecule on
a leukocyte such as a T-cell receptor molecule (e.g. CD2, CD3, CD28, or B7), or Fc receptors for
IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and FcγRIII (CD16) so as to focus cellular
defense mechanisms to the cell expressing the particular antigen. Bispecific antibodies can also
be used to direct cytotoxic agents to cells which express a particular antigen. These antibodies
possess an antigen-binding arm and an arm which binds a cytotoxic agent or a radionuclide
chelator, such as EOTUBE, DPTA, DOTA, or TETA. Another bispecific antibody of interest
binds the protein antigen described herein and further binds tissue factor (TF).

Heteroconjugate Antibodies

Heteroconjugate antibodies are also within the scope of the present invention.
Heteroconjugate antibodies are composed of two covalently joined antibodies. Such antibodies
have, for example, been proposed to target immune system cells to unwanted cells (U.S. Patent
No. 4,676,980), and for treatment of HIV infection (WO 91/00360; WO 92/200373; EP 03089).
It is contemplated that the antibodies can be prepared in vitro using known methods in synthetic
protein chemistry, including those involving crosslinking agents. For example, immunotoxins can
be constructed using a disulfide exchange reaction or by forming a thioether bond. Examples of
suitable reagents for this purpose include iminothiolate and methyl-4-mercaptobutyrimidate and
those disclosed, for example, in U.S. Patent No. 4,676,980.

Effector Function Engineering

It can be desirable to modify the antibody of the invention with respect to effector
function, so as to enhance, e.g., the effectiveness of the antibody in treating cancer. For example,
cysteine residue(s) can be introduced into the Fc region, thereby allowing interchain disulfide
bond formation in this region. The homodimeric antibody thus generated can have improved
internalization capability and/or increased complement-mediated cell killing and
antibody-dependent cellular cytotoxicity (ADCC). See Caron et al., J. Exp Med., 176: 1191-1195 (1992)
and Shopes, J. Immunol., 148: 2918-2922 (1992). Homodimeric antibodies with enhanced
anti-tumor activity can also be prepared using heterobifunctional cross-linkers as described in Wolff et
al., Cancer Research, 53: 2560-2565 (1993). Alternatively, an antibody can be engineered that has
dual Fc regions and can thereby have enhanced complement lysis and ADCC capabilities. See 
Stevenson et al., Anti-Cancer Drug Design, 3: 219-230 (1989). 

Immunoconjugates

The invention also pertains to immunoconjugates comprising an antibody conjugated to a
cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an enzymatically active toxin of
bacterial, fungal, plant, or animal origin, or fragments thereof), or a radioactive isotope (i.e., a 
radioconjugate). 

Chemotherapeutic agents useful in the generation of such immunoconjugates have been 
described above. Enzymatically active toxins and fragments thereof that can be used include 
diphtheria A chain, nonbinding active fragments of diphtheria toxin, exotoxin A chain (from
Pseudomonas aeruginosa), ricin A chain, abrin A chain, modeccin A chain, alpha-sarcin, Aleurites
fordii proteins, dianthin proteins, Phytolacca americana proteins (PAPI, PAPII, and PAP-S),
Momordica charantia inhibitor, curcin, crotin, Saponaria officinalis inhibitor, gelonin, mitogellin,
restrictocin, phenomycin, enomycin, and the trichothecenes. A variety of radionuclides are
available for the production of radioconjugated antibodies. Examples include 212Bi, 131I, 131In,
90Y, and 186Re.

Conjugates of the antibody and cytotoxic agent are made using a variety of bifunctional 
protein-coupling agents such as N-succinimidyl-3-(2-pyridyldithiol) propionate (SPDP), 
iminothiolane (IT), bifunctional derivatives of imidoesters (such as dimethyl adipimidate HCl),
active esters (such as disuccinimidyl suberate), aldehydes (such as glutaraldehyde), bis-azido
compounds (such as bis (p-azidobenzoyl) hexanediamine), bis-diazonium derivatives (such as
bis-(p-diazoniumbenzoyl)-ethylenediamine), diisocyanates (such as toluene 2,6-diisocyanate), and
bis-active fluorine compounds (such as 1,5-difluoro-2,4-dinitrobenzene). For example, a ricin
immunotoxin can be prepared as described in Vitetta et al., Science, 238: 1098 (1987).
Carbon-14-labeled 1-isothiocyanatobenzyl-3-methyldiethylene triaminepentaacetic acid (MX-DTPA) is
an exemplary chelating agent for conjugation of radionuclide to the antibody. See
WO94/11026.

In another embodiment, the antibody can be conjugated to a "receptor" (such as streptavidin)
for utilization in tumor pretargeting wherein the antibody-receptor conjugate is administered to
the patient, followed by removal of unbound conjugate from the circulation using a clearing agent
and then administration of a "ligand" (e.g., avidin) that is in turn conjugated to a cytotoxic agent.

In one embodiment, methods for the screening of antibodies that possess the desired 
specificity include, but are not limited to, enzyme-linked immunosorbent assay (ELISA) and other 
immunologically-mediated techniques known within the art. In a specific embodiment, selection
of antibodies that are specific to a particular domain of an NOVX protein is facilitated by
generation of hybridomas that bind to the fragment of an NOVX protein possessing such a 
domain. Thus, antibodies that are specific for a desired domain within an NOVX protein, or 
derivatives, fragments, analogs or homologs thereof, are also provided herein. 

Anti-NOVX antibodies may be used in methods known within the art relating to the
localization and/or quantitation of an NOVX protein (e.g., for use in measuring levels of the
NOVX protein within appropriate physiological samples, for use in diagnostic methods, for use in
imaging the protein, and the like). In a given embodiment, antibodies for NOVX proteins, or
derivatives, fragments, analogs or homologs thereof, that contain the antibody derived binding
domain, are utilized as pharmacologically-active compounds (hereinafter "Therapeutics").

An anti-NOVX antibody (e.g., monoclonal antibody) can be used to isolate an NOVX
polypeptide by standard techniques, such as affinity chromatography or immunoprecipitation. An
anti-NOVX antibody can facilitate the purification of natural NOVX polypeptide from cells and
of recombinantly-produced NOVX polypeptide expressed in host cells. Moreover, an anti-NOVX
antibody can be used to detect NOVX protein (e.g., in a cellular lysate or cell supernatant) in
order to evaluate the abundance and pattern of expression of the NOVX protein. Anti-NOVX
antibodies can be used diagnostically to monitor protein levels in tissue as part of a clinical testing
procedure, e.g., to determine the efficacy of a given treatment regimen. Detection
can be facilitated by coupling (i.e., physically linking) the antibody to a detectable substance.
Examples of detectable substances include various enzymes, prosthetic groups, fluorescent
materials, luminescent materials, bioluminescent materials, and radioactive materials. Examples
of suitable enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or
acetylcholinesterase; examples of suitable prosthetic group complexes include streptavidin/biotin
and avidin/biotin; examples of suitable fluorescent materials include umbelliferone, fluorescein,
fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or
phycoerythrin; an example of a luminescent material is luminol; examples of
bioluminescent materials include luciferase, luciferin, and aequorin; and examples of suitable
radioactive materials include 125I, 131I, 35S or 3H.

NOVX Recombinant Expression Vectors and Host Cells 

Another aspect of the invention pertains to vectors, preferably expression vectors, 
containing a nucleic acid encoding an NOVX protein, or derivatives, fragments, analogs or
homologs thereof. As used herein, the term "vector" refers to a nucleic acid molecule capable of
transporting another nucleic acid to which it has been linked. One type of vector is a "plasmid",
which refers to a circular double stranded DNA loop into which additional DNA segments can be
ligated. Another type of vector is a viral vector, wherein additional DNA segments can be ligated
into the viral genome. Certain vectors are capable of autonomous replication in a host cell into
which they are introduced (e.g., bacterial vectors having a bacterial origin of replication and
episomal mammalian vectors). Other vectors (e.g., non-episomal mammalian vectors) are
integrated into the genome of a host cell upon introduction into the host cell, and thereby are
replicated along with the host genome. Moreover, certain vectors are capable of directing the
expression of genes to which they are operatively-linked. Such vectors are referred to herein as
"expression vectors". In general, expression vectors of utility in recombinant DNA techniques are
often in the form of plasmids. In the present specification, "plasmid" and "vector" can be used
interchangeably as the plasmid is the most commonly used form of vector. However, the 
invention is intended to include such other forms of expression vectors, such as viral vectors (e.g., 
replication defective retroviruses, adenoviruses and adeno-associated viruses), which serve 
equivalent functions. 

The recombinant expression vectors of the invention comprise a nucleic acid of the
invention in a form suitable for expression of the nucleic acid in a host cell, which means that the
recombinant expression vectors include one or more regulatory sequences, selected on the basis of
the host cells to be used for expression, that are operatively-linked to the nucleic acid sequence to
be expressed. Within a recombinant expression vector, "operatively-linked" is intended to mean that
the nucleotide sequence of interest is linked to the regulatory sequence(s) in a manner that allows
for expression of the nucleotide sequence (e.g., in an in vitro transcription/translation system or in
a host cell when the vector is introduced into the host cell).
The term "regulatory sequence" is intended to include promoters, enhancers and other
expression control elements (e.g., polyadenylation signals). Such regulatory sequences are
described, for example, in Goeddel, Gene Expression Technology: Methods in Enzymology
185, Academic Press, San Diego, Calif. (1990). Regulatory sequences include those that direct
constitutive expression of a nucleotide sequence in many types of host cell and those that direct
expression of the nucleotide sequence only in certain host cells (e.g., tissue-specific regulatory
sequences). It will be appreciated by those skilled in the art that the design of the expression
vector can depend on such factors as the choice of the host cell to be transformed, the level of
expression of protein desired, etc. The expression vectors of the invention can be introduced into
host cells to thereby produce proteins or peptides, including fusion proteins or peptides, encoded
by nucleic acids as described herein (e.g., NOVX proteins, mutant forms of NOVX proteins,
fusion proteins, etc.).

The recombinant expression vectors of the invention can be designed for expression of
NOVX proteins in prokaryotic or eukaryotic cells. For example, NOVX proteins can be
expressed in bacterial cells such as Escherichia coli, insect cells (using baculovirus expression
vectors), yeast cells or mammalian cells. Suitable host cells are discussed further in Goeddel,
Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego,
Calif. (1990). Alternatively, the recombinant expression vector can be transcribed and translated
in vitro, for example using T7 promoter regulatory sequences and T7 polymerase.

Expression of proteins in prokaryotes is most often carried out in Escherichia coli with 
vectors containing constitutive or inducible promoters directing the expression of either fusion or 
non-fusion proteins. Fusion vectors add a number of amino acids to a protein encoded therein,
usually to the amino terminus of the recombinant protein. Such fusion vectors typically serve
three purposes: (i) to increase expression of recombinant protein; (ii) to increase the solubility of
the recombinant protein; and (iii) to aid in the purification of the recombinant protein by acting as
a ligand in affinity purification. Often, in fusion expression vectors, a proteolytic cleavage site is
introduced at the junction of the fusion moiety and the recombinant protein to enable separation of
the recombinant protein from the fusion moiety subsequent to purification of the fusion protein.
Enzymes suitable for such cleavage, each having a cognate recognition sequence, include Factor Xa,
thrombin and enterokinase. Typical fusion expression vectors include pGEX (Pharmacia Biotech Inc; Smith
and Johnson, 1988. Gene 67: 31-40), pMAL (New England Biolabs, Beverly, Mass.) and pRIT5
(Pharmacia, Piscataway, N.J.) that fuse glutathione S-transferase (GST), maltose E binding
protein, or protein A, respectively, to the target recombinant protein.
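
By way of illustration only, the separation of a recombinant protein from its fusion moiety at an engineered cleavage site can be sketched as a string operation on one-letter amino acid sequences. The motifs below (IEGR for Factor Xa, LVPR for thrombin, DDDDK for enterokinase, each cut on the carboxyl side of the last residue) are commonly cited recognition sequences and are stated here as simplifying assumptions, since the specification does not recite them and real proteases tolerate variants. The function name and the example sequences are hypothetical.

```python
# Commonly cited recognition motifs in one-letter code. These are simplifying
# assumptions for illustration; cleavage is modeled as occurring immediately
# after the last residue of the motif.
CLEAVAGE_MOTIFS = {
    "Factor Xa": "IEGR",
    "thrombin": "LVPR",
    "enterokinase": "DDDDK",
}


def split_fusion(fusion_seq, protease):
    """Split a fusion protein at the first occurrence of the protease motif.

    Returns (fusion moiety including the motif, released recombinant protein).
    """
    motif = CLEAVAGE_MOTIFS[protease]
    start = fusion_seq.find(motif)
    if start < 0:
        raise ValueError(f"no {protease} site found in sequence")
    cut = start + len(motif)
    return fusion_seq[:cut], fusion_seq[cut:]


# Hypothetical sequences: a short stand-in for the fusion moiety, a Factor Xa
# site at the junction, and a short stand-in for the target protein.
tag = "MSPILGYWK"
target = "MKTAYIAKQ"
fusion = tag + CLEAVAGE_MOTIFS["Factor Xa"] + target

moiety, released = split_fusion(fusion, "Factor Xa")
print(moiety, released)
```

Placing the site at the junction, as in this sketch, means the released protein carries no residues from the fusion moiety.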

Examples of suitable inducible non-fusion E. coli expression vectors include pTrc
(Amrann et al., (1988) Gene 69:301-315) and pET 11d (Studier et al., Gene Expression
Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. (1990) 60-89).

One strategy to maximize recombinant protein expression in E. coli is to express the
protein in a host bacteria with an impaired capacity to proteolytically cleave the recombinant
protein. See, e.g., Gottesman, Gene Expression Technology: Methods in Enzymology 185,
Academic Press, San Diego, Calif. (1990) 119-128. Another strategy is to alter the nucleic acid
sequence of the nucleic acid to be inserted into an expression vector so that the individual codons
for each amino acid are those preferentially utilized in E. coli (see, e.g., Wada, et al., 1992. Nucl.
Acids Res. 20: 2111-2118). Such alteration of nucleic acid sequences of the invention can be
carried out by standard DNA synthesis techniques.
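
The codon alteration strategy can be sketched as follows: translate the coding sequence with the standard genetic code, then rewrite each amino acid with a single preferred codon, leaving the encoded protein unchanged. This is a minimal sketch under stated assumptions. The PREFERRED table is an illustrative placeholder and is not taken from Wada et al. or from any measured E. coli codon usage data, and the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Illustrative placeholder: one assumed "preferred" codon per amino acid.
# A real design would take these from a codon usage table for the host.
PREFERRED = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
    "*": "TAA",
}


def translate(dna):
    """Translate an in-frame coding sequence into one-letter amino acids."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def recode(dna):
    """Replace every codon with the assumed preferred synonymous codon."""
    return "".join(PREFERRED[aa] for aa in translate(dna.upper()))


original = "ATGAGGCTAGGATAA"   # hypothetical coding sequence
optimized = recode(original)
print(original, optimized, translate(optimized))
```

Because only synonymous codons are substituted, translating the recoded sequence gives the same amino acid sequence as the original.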
In another embodiment, the NOVX expression vector is a yeast expression vector.
Examples of vectors for expression in yeast Saccharomyces cerevisiae include pYepSecl (Baldari,
et al., 1987. EMBO J. 6: 229-234), pMFa (Kurjan and Herskowitz, 1982. Cell 30: 933-943),
pJRY88 (Schultz et al., 1987. Gene 54: 113-123), pYES2 (Invitrogen Corporation, San Diego,
Calif.), and picZ (Invitrogen Corp., San Diego, Calif.).

Alternatively, NOVX can be expressed in insect cells using baculovirus expression 
vectors. Baculovirus vectors available for expression of proteins in cultured insect cells (e.g., SF9 
cells) include the pAc series (Smith, et al., 1983. Mol. Cell. Biol. 3: 2156-2165) and the pVL
series (Lucklow and Summers, 1989. Virology 170: 31-39). 

In yet another embodiment, a nucleic acid of the invention is expressed in mammalian 
cells using a mammalian expression vector. Examples of mammalian expression vectors include 
pCDM8 (Seed, 1987. Nature 329: 840) and pMT2PC (Kaufman, et al., 1987. EMBO J. 6:
187-195). When used in mammalian cells, the expression vector's control functions are often 
provided by viral regulatory elements. For example, commonly used promoters are derived from 
polyoma, adenovirus 2, cytomegalovirus, and simian virus 40. For other suitable expression 
systems for both prokaryotic and eukaryotic cells see, e.g., Chapters 16 and 17 of Sambrook, et
al., Molecular Cloning: A Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory,
Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989. 

In another embodiment, the recombinant mammalian expression vector is capable of 
directing expression of the nucleic acid preferentially in a particular cell type (e.g., tissue-specific 
regulatory elements are used to express the nucleic acid). Tissue-specific regulatory elements are 
known in the art. Non-limiting examples of suitable tissue-specific promoters include the 
albumin promoter (liver-specific; Pinkert, et al., 1987. Genes Dev. 1: 268-277), lymphoid-specific
promoters (Calame and Eaton, 1988. Adv. Immunol. 43: 235-275), in particular promoters of T
cell receptors (Winoto and Baltimore, 1989. EMBO J. 8: 729-733) and immunoglobulins (Banerji,
et al., 1983. Cell 33: 729-740; Queen and Baltimore, 1983. Cell 33: 741-748), neuron-specific
promoters (e.g., the neurofilament promoter; Byrne and Ruddle, 1989. Proc. Natl. Acad. Sci. USA
86: 5473-5477), pancreas-specific promoters (Edlund, et al., 1985. Science 230: 912-916), and
mammary gland-specific promoters (e.g., milk whey promoter; U.S. Pat. No. 4,873,316 and
European Application Publication No. 264,166). Developmentally-regulated promoters are also
encompassed, e.g., the murine hox promoters (Kessel and Gruss, 1990. Science 249: 374-379) and
the α-fetoprotein promoter (Campes and Tilghman, 1989. Genes Dev. 3: 537-546).

The invention further provides a recombinant expression vector comprising a DNA 
molecule of the invention cloned into the expression vector in an antisense orientation. That is, 
the DNA molecule is operatively-linked to a regulatory sequence in a manner that allows for
expression (by transcription of the DNA molecule) of an RNA molecule that is antisense to
NOVX mRNA. Regulatory sequences operatively linked to a nucleic acid cloned in the antisense
orientation can be chosen that direct the continuous expression of the antisense RNA molecule in
a variety of cell types, for instance viral promoters and/or enhancers, or regulatory sequences can
be chosen that direct constitutive, tissue specific or cell type specific expression of antisense
RNA. The antisense expression vector can be in the form of a recombinant plasmid, phagemid or
attenuated virus in which antisense nucleic acids are produced under the control of a high
efficiency regulatory region, the activity of which can be determined by the cell type into which
the vector is introduced. For a discussion of the regulation of gene expression using antisense
genes see, e.g., Weintraub, et al., "Antisense RNA as a molecular tool for genetic analysis,"
Reviews-Trends in Genetics, Vol. 1(1) 1986.

Another aspect of the invention pertains to host cells into which a recombinant expression 
vector of the invention has been introduced. The terms "host cell" and "recombinant host cell" are 
used interchangeably herein. It is understood that such terms refer not only to the particular 
subject cell but also to the progeny or potential progeny of such a cell. Because certain 
modifications may occur in succeeding generations due to either mutation or environmental 
influences, such progeny may not, in fact, be identical to the parent cell, but are still included 
within the scope of the term as used herein. 

A host cell can be any prokaryotic or eukaryotic cell. For example, NOVX protein can be 
expressed in bacterial cells such as E. coli, insect cells, yeast or mammalian cells (such as Chinese 
hamster ovary cells (CHO) or COS cells). Other suitable host cells are known to those skilled in 
the art. 

Vector DNA can be introduced into prokaryotic or eukaryotic cells via conventional 
transformation or transfection techniques. As used herein, the terms "transformation" and 
"transfection" are intended to refer to a variety of art-recognized techniques for introducing 
foreign nucleic acid (e.g., DNA) into a host cell, including calcium phosphate or calcium chloride 
co-precipitation, DEAE-dextran-mediated transfection, lipofection, or electroporation. Suitable 
methods for transforming or transfecting host cells can be found in Sambrook, et al. (Molecular 
Cloning: A Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold Spring 
Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989), and other laboratory manuals. 

For stable transfection of mammalian cells, it is known that, depending upon the 
expression vector and transfection technique used, only a small fraction of cells may integrate the 
foreign DNA into their genome. In order to identify and select these integrants, a gene that 
encodes a selectable marker (e.g., resistance to antibiotics) is generally introduced into the host 
cells along with the gene of interest. Various selectable markers include those that confer 
resistance to drugs, such as G418, hygromycin and methotrexate. Nucleic acid encoding a 
selectable marker can be introduced into a host cell on the same vector as that encoding NOVX or 
can be introduced on a separate vector. Cells stably transfected with the introduced nucleic acid 
can be identified by drug selection (e.g., cells that have incorporated the selectable marker gene 
will survive, while the other cells die). 

A host cell of the invention, such as a prokaryotic or eukaryotic host cell in culture, can be 
used to produce (i.e., express) NOVX protein. Accordingly, the invention further provides 
methods for producing NOVX protein using the host cells of the invention. In one embodiment, 
the method comprises culturing the host cell of the invention (into which a recombinant expression 
vector encoding NOVX protein has been introduced) in a suitable medium such that NOVX 
protein is produced. In another embodiment, the method further comprises isolating NOVX 
protein from the medium or the host cell. 

Transgenic NOVX Animals 

The host cells of the invention can also be used to produce non-human transgenic animals. 
For example, in one embodiment, a host cell of the invention is a fertilized oocyte or an 
embryonic stem cell into which NOVX protein-coding sequences have been introduced. Such host 
cells can then be used to create non-human transgenic animals in which exogenous NOVX 
sequences have been introduced into their genome or homologous recombinant animals in which 
endogenous NOVX sequences have been altered. Such animals are useful for studying the 
function and/or activity of NOVX protein and for identifying and/or evaluating modulators of 
NOVX protein activity. As used herein, a "transgenic animal" is a non-human animal, preferably 
a mammal, more preferably a rodent such as a rat or mouse, in which one or more of the cells of 
the animal includes a transgene. Other examples of transgenic animals include non-human 
primates, sheep, dogs, cows, goats, chickens, amphibians, etc. A transgene is exogenous DNA 
that is integrated into the genome of a cell from which a transgenic animal develops and that 
remains in the genome of the mature animal, thereby directing the expression of an encoded gene 
product in one or more cell types or tissues of the transgenic animal. As used herein, a 
"homologous recombinant animal" is a non-human animal, preferably a mammal, more preferably 
a mouse, in which an endogenous NOVX gene has been altered by homologous recombination 
between the endogenous gene and an exogenous DNA molecule introduced into a cell of the 
animal, e.g., an embryonic cell of the animal, prior to development of the animal. 

A transgenic animal of the invention can be created by introducing NOVX-encoding 
nucleic acid into the male pronuclei of a fertilized oocyte (e.g., by microinjection, retroviral 
infection) and allowing the oocyte to develop in a pseudopregnant female foster animal. The 
human NOVX cDNA sequences SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27 and 
29 can be introduced as a transgene into the genome of a non-human animal. Alternatively, a 
non-human homologue of the human NOVX gene, such as a mouse NOVX gene, can be isolated 
based on hybridization to the human NOVX cDNA (described further supra) and used as a 
transgene. Intronic sequences and polyadenylation signals can also be included in the transgene 
to increase the efficiency of expression of the transgene. A tissue-specific regulatory sequence(s) 
can be operably-linked to the NOVX transgene to direct expression of NOVX protein to particular 
cells. Methods for generating transgenic animals via embryo manipulation and microinjection, 
particularly animals such as mice, have become conventional in the art and are described, for 
example, in U.S. Patent Nos. 4,736,866; 4,870,009; and 4,873,191; and Hogan, 1986. In: 
Manipulating the Mouse Embryo, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, 
N.Y. Similar methods are used for production of other transgenic animals. A transgenic founder 
animal can be identified based upon the presence of the NOVX transgene in its genome and/or 
expression of NOVX mRNA in tissues or cells of the animals. A transgenic founder animal can 
then be used to breed additional animals carrying the transgene. Moreover, transgenic animals 
carrying a transgene encoding NOVX protein can further be bred to other transgenic animals 
carrying other transgenes. 

To create a homologous recombinant animal, a vector is prepared which contains at least a 
portion of an NOVX gene into which a deletion, addition or substitution has been introduced to 
thereby alter, e.g., functionally disrupt, the NOVX gene. The NOVX gene can be a human gene 
(e.g., the cDNA of SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27 and 29), but more 
preferably, is a non-human homologue of a human NOVX gene. For example, a mouse 
homologue of human NOVX gene of SEQ ID NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27 
and 29 can be used to construct a homologous recombination vector suitable for altering an 
endogenous NOVX gene in the mouse genome. In one embodiment, the vector is designed such 
that, upon homologous recombination, the endogenous NOVX gene is functionally disrupted (i.e., 
no longer encodes a functional protein; also referred to as a "knock out" vector). 

Alternatively, the vector can be designed such that, upon homologous recombination, the 
endogenous NOVX gene is mutated or otherwise altered but still encodes functional protein (e.g., 
the upstream regulatory region can be altered to thereby alter the expression of the endogenous 
NOVX protein). In the homologous recombination vector, the altered portion of the NOVX gene 
is flanked at its 5'- and 3'-termini by additional nucleic acid of the NOVX gene to allow for 
homologous recombination to occur between the exogenous NOVX gene carried by the vector 
and an endogenous NOVX gene in an embryonic stem cell. The additional flanking NOVX 
nucleic acid is of sufficient length for successful homologous recombination with the endogenous 
gene. Typically, several kilobases of flanking DNA (both at the 5'- and 3'-termini) are included in 
the vector. See, e.g., Thomas, et al., 1987. Cell 51: 503 for a description of homologous 
recombination vectors. The vector is then introduced into an embryonic stem cell line (e.g., by 
electroporation) and cells in which the introduced NOVX gene has homologously-recombined 
with the endogenous NOVX gene are selected. See, e.g., Li, et al., 1992. Cell 69: 915. 

The selected cells are then injected into a blastocyst of an animal (e.g., a mouse) to form 
aggregation chimeras. See, e.g., Bradley, 1987. In: Teratocarcinomas and Embryonic Stem 
Cells: A Practical Approach, Robertson, ed. IRL, Oxford, pp. 113-152. A chimeric embryo 
can then be implanted into a suitable pseudopregnant female foster animal and the embryo 
brought to term. Progeny harboring the homologously-recombined DNA in their germ cells can 
be used to breed animals in which all cells of the animal contain the homologously-recombined 
DNA by germline transmission of the transgene. Methods for constructing homologous 
recombination vectors and homologous recombinant animals are described further in Bradley, 
1991. Curr. Opin. Biotechnol. 2: 823-829; PCT International Publication Nos.: WO 90/11354; 
WO 91/01140; WO 92/0968; and WO 93/04169. 

In another embodiment, transgenic non-human animals can be produced that contain 
selected systems that allow for regulated expression of the transgene. One example of such a 
system is the cre/loxP recombinase system of bacteriophage P1. For a description of the cre/loxP 
recombinase system, see, e.g., Lakso, et al., 1992. Proc. Natl. Acad. Sci. USA 89: 6232-6236. 
Another example of a recombinase system is the FLP recombinase system of Saccharomyces 
cerevisiae. See, O'Gorman, et al., 1991. Science 251:1351-1355. If a cre/loxP recombinase 
system is used to regulate expression of the transgene, animals containing transgenes encoding 
both the Cre recombinase and a selected protein are required. Such animals can be provided 
through the construction of "double" transgenic animals, e.g., by mating two transgenic animals, 
one containing a transgene encoding a selected protein and the other containing a transgene 
encoding a recombinase. 
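
As a rough illustration of the mating step, the following sketch estimates the fraction of offspring expected to carry both transgenes. It rests on assumptions that are not part of this description: each parent is taken to be hemizygous for a single-copy transgene, the two transgenes are taken to be unlinked, and transmission is taken to follow simple Mendelian odds. The function name cross_hemizygous and its parameters are hypothetical.

```python
from fractions import Fraction


def cross_hemizygous(p_recombinase=Fraction(1, 2), p_selected=Fraction(1, 2)):
    """Expected offspring classes from mating two single-transgenic animals.

    p_recombinase: assumed chance the recombinase parent transmits its transgene.
    p_selected:    assumed chance the other parent transmits the transgene
                   encoding the selected protein.
    The two transmissions are treated as independent (unlinked loci).
    """
    return {
        "both transgenes": p_recombinase * p_selected,
        "recombinase only": p_recombinase * (1 - p_selected),
        "selected protein only": (1 - p_recombinase) * p_selected,
        "neither": (1 - p_recombinase) * (1 - p_selected),
    }


if __name__ == "__main__":
    for genotype, fraction in cross_hemizygous().items():
        print(f"{genotype}: {fraction}")
```

Under these assumptions about one offspring in four would be a "double" transgenic animal; a homozygous parent could be modeled by passing a transmission probability of 1 for that parent.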

Clones of the non-human transgenic animals described herein can also be produced 
according to the methods described in Wilmut, et al., 1997. Nature 385: 810-813. In brief, a cell 
(e.g., a somatic cell) from the transgenic animal can be isolated and induced to exit the growth 
cycle and enter G0 phase. The quiescent cell can then be fused, e.g., through the use of electrical 
pulses, to an enucleated oocyte from an animal of the same species from which the quiescent cell 
is isolated. The reconstructed oocyte is then cultured such that it develops to morula or blastocyst 
and then transferred to a pseudopregnant female foster animal. The offspring borne of this female 
foster animal will be a clone of the animal from which the cell (e.g., the somatic cell) is isolated. 

Pharmaceutical Compositions 

The NOVX nucleic acid molecules, NOVX proteins, and anti-NOVX antibodies (also 
referred to herein as "active compounds") of the invention, and derivatives, fragments, analogs 
and homologs thereof, can be incorporated into pharmaceutical compositions suitable for 
administration. Such compositions typically comprise the nucleic acid molecule, protein, or 
antibody and a pharmaceutically acceptable carrier. As used herein, "pharmaceutically acceptable 
carrier" is intended to include any and all solvents, dispersion media, coatings, antibacterial and 
antifungal agents, isotonic and absorption delaying agents, and the like, compatible with 
pharmaceutical administration. Suitable carriers are described in the most recent edition of 
Remington's Pharmaceutical Sciences, a standard reference text in the field, which is incorporated 
herein by reference. Preferred examples of such carriers or diluents include, but are not limited 
to, water, saline, Ringer's solutions, dextrose solution, and 5% human serum albumin. Liposomes 
and non-aqueous vehicles such as fixed oils may also be used. The use of such media and agents 
for pharmaceutically active substances is well known in the art. Except insofar as any 
conventional media or agent is incompatible with the active compound, use thereof in the 
compositions is contemplated. Supplementary active compounds can also be incorporated into 
the compositions. 

A pharmaceutical composition of the invention is formulated to be compatible with its 
intended route of administration. Examples of routes of administration include parenteral, e.g., 
intravenous, intradermal, subcutaneous, oral (e.g., inhalation), transdermal (i.e., topical), 
transmucosal, and rectal administration. Solutions or suspensions used for parenteral, 
intradermal, or subcutaneous application can include the following components: a sterile diluent 
such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerine, propylene 
glycol or other synthetic solvents; antibacterial agents such as benzyl alcohol or methyl parabens; 
antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such as 
ethylenediaminetetraacetic acid (EDTA); buffers such as acetates, citrates or phosphates, and 
agents for the adjustment of tonicity such as sodium chloride or dextrose. The pH can be adjusted 
with acids or bases, such as hydrochloric acid or sodium hydroxide. The parenteral preparation 
can be enclosed in ampoules, disposable syringes or multiple dose vials made of glass or plastic. 

Pharmaceutical compositions suitable for injectable use include sterile aqueous solutions 
(where water soluble) or dispersions and sterile powders for the extemporaneous preparation of 
sterile injectable solutions or dispersion. For intravenous administration, suitable carriers include 
physiological saline, bacteriostatic water, Cremophor EL™ (BASF, Parsippany, N.J.) or phosphate 
buffered saline (PBS). In all cases, the composition must be sterile and should be fluid to the 
extent that easy syringeability exists. It must be stable under the conditions of manufacture and 
storage and must be preserved against the contaminating action of microorganisms such as 
bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for example, 
water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid polyethylene glycol, 
and the like), and suitable mixtures thereof. The proper fluidity can be maintained, for example, 
by the use of a coating such as lecithin, by the maintenance of the required particle size in the case 
of dispersion and by the use of surfactants. Prevention of the action of microorganisms can be 
achieved by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, 
phenol, ascorbic acid, thimerosal, and the like. In many cases, it will be preferable to include 
isotonic agents, for example, sugars, polyalcohols such as mannitol or sorbitol, or sodium chloride, in 
the composition. Prolonged absorption of the injectable compositions can be brought about by 
including in the composition an agent which delays absorption, for example, aluminum 
monostearate and gelatin. 

Sterile injectable solutions can be prepared by incorporating the active compound (e.g., an 
NOVX protein or anti-NOVX antibody) in the required amount in an appropriate solvent with one 
or a combination of ingredients enumerated above, as required, followed by filtered sterilization. 
Generally, dispersions are prepared by incorporating the active compound into a sterile vehicle 
that contains a basic dispersion medium and the required other ingredients from those enumerated 
above. In the case of sterile powders for the preparation of sterile injectable solutions, methods of 
preparation are vacuum drying and freeze-drying, which yield a powder of the active ingredient plus 
any additional desired ingredient from a previously sterile-filtered solution thereof. 

Oral compositions generally include an inert diluent or an edible carrier. They can be 
enclosed in gelatin capsules or compressed into tablets. For the purpose of oral therapeutic 
administration, the active compound can be incorporated with excipients and used in the form of 
tablets, troches, or capsules. Oral compositions can also be prepared using a fluid carrier for use 
as a mouthwash, wherein the compound in the fluid carrier is applied orally and swished and 
expectorated or swallowed. Pharmaceutically compatible binding agents, and/or adjuvant 
materials can be included as part of the composition. The tablets, pills, capsules, troches and the 
like can contain any of the following ingredients, or compounds of a similar nature: a binder such 
as microcrystalline cellulose, gum tragacanth or gelatin; an excipient such as starch or lactose, a 
disintegrating agent such as alginic acid, Primogel, or corn starch; a lubricant such as magnesium 
stearate or Sterotes; a glidant such as colloidal silicon dioxide; a sweetening agent such as sucrose 
or saccharin; or a flavoring agent such as peppermint, methyl salicylate, or orange flavoring. 

For administration by inhalation, the compounds are delivered in the form of an aerosol 
spray from a pressurized container or dispenser which contains a suitable propellant, e.g., a gas such 
as carbon dioxide, or a nebulizer. 

Systemic administration can also be by transmucosal or transdermal means. For 
transmucosal or transdermal administration, penetrants appropriate to the barrier to be permeated 
are used in the formulation. Such penetrants are generally known in the art, and include, for 
example, for transmucosal administration, detergents, bile salts, and fusidic acid derivatives. 
Transmucosal administration can be accomplished through the use of nasal sprays or 
suppositories. For transdermal administration, the active compounds are formulated into 
ointments, salves, gels, or creams as generally known in the art. 

The compounds can also be prepared in the form of suppositories (e.g., with conventional 
suppository bases such as cocoa butter and other glycerides) or retention enemas for rectal 
delivery. 

In one embodiment, the active compounds are prepared with carriers that will protect the 
compound against rapid elimination from the body, such as a controlled release formulation, 
including implants and microencapsulated delivery systems. Biodegradable, biocompatible 
polymers can be used, such as ethylene vinyl acetate, polyanhydrides, polyglycolic acid, collagen, 
polyorthoesters, and polylactic acid. Methods for preparation of such formulations will be 
apparent to those skilled in the art. The materials can also be obtained commercially from Alza 
Corporation and Nova Pharmaceuticals, Inc. Liposomal suspensions (including liposomes 
targeted to infected cells with monoclonal antibodies to viral antigens) can also be used as 
pharmaceutically acceptable carriers. These can be prepared according to methods known to 
those skilled in the art, for example, as described in U.S. Patent No. 4,522,811. 

It is especially advantageous to formulate oral or parenteral compositions in dosage unit 
form for ease of administration and uniformity of dosage. Dosage unit form as used herein refers 
to physically discrete units suited as unitary dosages for the subject to be treated; each unit 
containing a predetermined quantity of active compound calculated to produce the desired 
therapeutic effect in association with the required pharmaceutical carrier. The specification for 
the dosage unit forms of the invention is dictated by and directly dependent on the unique 
characteristics of the active compound and the particular therapeutic effect to be achieved, and the 
limitations inherent in the art of compounding such an active compound for the treatment of 
individuals. 

The nucleic acid molecules of the invention can be inserted into vectors and used as gene 
therapy vectors. Gene therapy vectors can be delivered to a subject by, for example, intravenous 
injection, local administration (see, e.g., U.S. Patent No. 5,328,470) or by stereotactic injection 
(see, e.g., Chen, et al., 1994. Proc. Natl. Acad. Sci. USA 91: 3054-3057). The pharmaceutical 
preparation of the gene therapy vector can include the gene therapy vector in an acceptable 
diluent, or can comprise a slow release matrix in which the gene delivery vehicle is imbedded. 
Alternatively, where the complete gene delivery vector can be produced intact from recombinant 
cells, e.g., retroviral vectors, the pharmaceutical preparation can include one or more cells that 
produce the gene delivery system. 

The pharmaceutical compositions can be included in a container, pack, or dispenser 
together with instructions for administration. 

Screening and Detection Methods 

The isolated nucleic acid molecules of the invention can be used to express NOVX protein 
(e.g., via a recombinant expression vector in a host cell in gene therapy applications), to detect 
NOVX mRNA (e.g., in a biological sample) or a genetic lesion in an NOVX gene, and to 
modulate NOVX activity, as described further, below. In addition, the NOVX proteins can be 
used to screen drugs or compounds that modulate the NOVX protein activity or expression as well 
as to treat disorders characterized by insufficient or excessive production of NOVX protein or 
production of NOVX protein forms that have decreased or aberrant activity compared to NOVX 
wild-type protein (e.g., diabetes (regulates insulin release); obesity (binds and transports lipids); 
metabolic disturbances associated with obesity, the metabolic syndrome X, as well as anorexia 
and wasting disorders associated with chronic diseases and various cancers; infectious 
disease (possesses anti-microbial activity); and the various dyslipidemias). In addition, the 
anti-NOVX antibodies of the invention can be used to detect and isolate NOVX proteins and 
modulate NOVX activity. In yet a further aspect, the invention can be used in methods to 
influence appetite, absorption of nutrients and the disposition of metabolic substrates in both a 
positive and negative fashion. 

The invention further pertains to novel agents identified by the screening assays described 
herein and uses thereof for treatments as described, supra. 

Screening Assays 

The invention provides a method (also referred to herein as a "screening assay") for 
identifying modulators, i.e., candidate or test compounds or agents (e.g., peptides, 
peptidomimetics, small molecules or other drugs) that bind to NOVX proteins or have a 
stimulatory or inhibitory effect on, e.g., NOVX protein expression or NOVX protein activity. The 
invention also includes compounds identified in the screening assays described herein. 

In one embodiment, the invention provides assays for screening candidate or test 
compounds which bind to or modulate the activity of the membrane-bound form of an NOVX 
protein or polypeptide or biologically-active portion thereof. The test compounds of the invention 
can be obtained using any of the numerous approaches in combinatorial library methods known in 
the art, including: biological libraries; spatially addressable parallel solid phase or solution phase 
libraries; synthetic library methods requiring deconvolution; the "one-bead one-compound" 
library method; and synthetic library methods using affinity chromatography selection. The 
biological library approach is limited to peptide libraries, while the other four approaches are 
applicable to peptide, non-peptide oligomer or small molecule libraries of compounds. See, e.g., 
Lam, 1997. Anticancer Drug Design 12: 145. 

A "small molecule" as used herein, is meant to refer to a composition that has a molecular 
weight of less than about 5 kD and most preferably less than about 4 kD. Small molecules can be, 
e.g., nucleic acids, peptides, polypeptides, peptidomimetics, carbohydrates, lipids or other organic 
or inorganic molecules. Libraries of chemical and/or biological mixtures, such as fungal, 
bacterial, or algal extracts, are known in the art and can be screened with any of the assays of the 
invention. 
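
The molecular-weight definition above can be sketched as a simple library filter. This is only an illustration: the Compound record, the classify function, and the example masses are hypothetical, and treating "about 5 kD" and "about 4 kD" as hard cutoffs is a simplifying assumption rather than part of the definition.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    name: str             # hypothetical library identifier
    mol_weight_da: float  # molecular weight in daltons


def classify(compound, limit_kd=5.0, preferred_kd=4.0):
    """Label a compound against the stated size thresholds.

    The hard cutoffs stand in for "about 5 kD" and "about 4 kD";
    that simplification is an assumption of this sketch.
    """
    weight_kd = compound.mol_weight_da / 1000.0
    if weight_kd < preferred_kd:
        return "small molecule (most preferred)"
    if weight_kd < limit_kd:
        return "small molecule"
    return "not a small molecule"


if __name__ == "__main__":
    library = [
        Compound("example-peptide", 1200.0),
        Compound("example-oligomer", 4500.0),
        Compound("example-protein", 66000.0),
    ]
    for entry in library:
        print(entry.name, "->", classify(entry))
```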

Examples of methods for the synthesis of molecular libraries can be found in the art, for 
example in: DeWitt, et al., 1993. Proc. Natl. Acad. Sci. U.S.A. 90: 6909; Erb, et al., 1994. Proc. 
Natl. Acad. Sci. U.S.A. 91: 11422; Zuckermann, et al., 1994. J. Med. Chem. 37: 2678; Cho, et al., 
1993. Science 261: 1303; Carell, et al., 1994. Angew. Chem. Int. Ed. Engl. 33: 2059; Carell, et 
al., 1994. Angew. Chem. Int. Ed. Engl. 33: 2061; and Gallop, et al., 1994. J. Med. Chem. 37: 
1233. 

Libraries of compounds may be presented in solution (e.g., Houghten, 1992. 
Biotechniques 13: 412-421), or on beads (Lam, 1991. Nature 354: 82-84), on chips (Fodor, 1993. 
Nature 364: 555-556), bacteria (Ladner, U.S. Patent No. 5,223,409), spores (Ladner, U.S. Patent 
5,233,409), plasmids (Cull, et al., 1992. Proc. Natl. Acad. Sci. USA 89: 1865-1869) or on phage 
(Scott and Smith, 1990. Science 249: 386-390; Devlin, 1990. Science 249: 404-406; Cwirla, et al., 
1990. Proc. Natl. Acad. Sci. U.S.A. 87: 6378-6382; Felici, 1991. J. Mol. Biol. 222: 301-310; 
Ladner, U.S. Patent No. 5,233,409). 

In one embodiment, an assay is a cell-based assay in which a cell which expresses a 
membrane-bound form of NOVX protein, or a biologically-active portion thereof, on the cell 
surface is contacted with a test compound and the ability of the test compound to bind to an 
NOVX protein is determined. The cell, for example, can be of mammalian origin or a yeast cell. 
Determining the ability of the test compound to bind to the NOVX protein can be accomplished, 
for example, by coupling the test compound with a radioisotope or enzymatic label such that 
binding of the test compound to the NOVX protein or biologically-active portion thereof can be 
determined by detecting the labeled compound in a complex. For example, test compounds can 
be labeled with 125I, 35S, 14C, or 3H, either directly or indirectly, and the radioisotope detected by 
direct counting of radioemission or by scintillation counting. Alternatively, test compounds can be 
enzymatically-labeled with, for example, horseradish peroxidase, alkaline phosphatase, or 
luciferase, and the enzymatic label detected by determination of conversion of an appropriate 
substrate to product. In one embodiment, the assay comprises contacting a cell which expresses a 
membrane-bound form of NOVX protein, or a biologically-active portion thereof, on the cell 
surface with a known compound which binds NOVX to form an assay mixture, contacting the 
assay mixture with a test compound, and determining the ability of the test compound to interact 
with an NOVX protein, wherein determining the ability of the test compound to interact with an 
NOVX protein comprises determining the ability of the test compound to preferentially bind to 
NOVX protein or a biologically-active portion thereof as compared to the known compound. 
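
One way to put a number on the competition format described above is percent displacement of the known compound. The sketch below is illustrative only and makes assumptions not stated in this description: the known compound is assumed to carry the detectable label, a nonspecific-binding background is assumed to have been measured separately, and the function name percent_displacement and the example counts are hypothetical.

```python
def percent_displacement(bound_without_test, bound_with_test, nonspecific=0.0):
    """Percent of specifically bound labeled known compound displaced by a test compound.

    bound_without_test: signal (e.g., counts) with the labeled known compound alone.
    bound_with_test:    signal after the test compound is added to the assay mixture.
    nonspecific:        assumed background signal subtracted from both readings.
    """
    specific_control = bound_without_test - nonspecific
    if specific_control <= 0:
        raise ValueError("control signal must exceed nonspecific background")
    specific_test = bound_with_test - nonspecific
    return 100.0 * (1.0 - specific_test / specific_control)


if __name__ == "__main__":
    # Hypothetical readings: 1000 counts alone, 400 with test compound, 200 background.
    print(percent_displacement(1000.0, 400.0, nonspecific=200.0))
```

A higher value would suggest that the test compound competes more effectively with the known compound for NOVX; any threshold for calling a compound a preferential binder would have to be set by the experimenter.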

In another embodiment, an assay is a cell-based assay comprising contacting a cell 
expressing a membrane-bound form of NOVX protein, or a biologically-active portion thereof, on 
the cell surface with a test compound and determining the ability of the test compound to 
modulate (e.g., stimulate or inhibit) the activity of the NOVX protein or biologically-active 
portion thereof. Determining the ability of the test compound to modulate the activity of NOVX 
or a biologically-active portion thereof can be accomplished, for example, by determining the 
ability of the NOVX protein to bind to or interact with an NOVX target molecule. As used 
herein, a "target molecule" is a molecule with which an NOVX protein binds or interacts in 
nature, for example, a molecule on the surface of a cell which expresses an NOVX interacting 
protein, a molecule on the surface of a second cell, a molecule in the extracellular milieu, a 
molecule associated with the internal surface of a cell membrane or a cytoplasmic molecule. An 
NOVX target molecule can be a non-NOVX molecule or an NOVX protein or polypeptide of the 
invention. In one embodiment, an NOVX target molecule is a component of a signal transduction 
pathway that facilitates transduction of an extracellular signal (e.g. a signal generated by binding 
of a compound to a membrane-bound NOVX molecule) through the cell membrane and into the 
cell. The target, for example, can be a second intercellular protein that has catalytic activity or a 
protein that facilitates the association of downstream signaling molecules with NOVX. 

Determining the ability of the NOVX protein to bind to or interact with an NOVX target 
molecule can be accomplished by one of the methods described above for determining direct 
binding. In one embodiment, determining the ability of the NOVX protein to bind to or interact 
with an NOVX target molecule can be accomplished by determining the activity of the target 
molecule. For example, the activity of the target molecule can be determined by detecting 
induction of a cellular second messenger of the target (i.e. intracellular Ca2+, diacylglycerol, IP3, 
etc.), detecting catalytic/enzymatic activity of the target on an appropriate substrate, detecting the 
induction of a reporter gene (comprising an NOVX-responsive regulatory element operatively 
linked to a nucleic acid encoding a detectable marker, e.g., luciferase), or detecting a cellular 
response, for example, cell survival, cellular differentiation, or cell proliferation. 

In yet another embodiment, an assay of the invention is a cell-free assay comprising 
contacting an NOVX protein or biologically-active portion thereof with a test compound and 
determining the ability of the test compound to bind to the NOVX protein or biologically-active 
portion thereof. Binding of the test compound to the NOVX protein can be determined either 
directly or indirectly as described above. In one such embodiment, the assay comprises 
contacting the NOVX protein or biologically-active portion thereof with a known compound 
which binds NOVX to form an assay mixture, contacting the assay mixture with a test compound, 
and determining the ability of the test compound to interact with an NOVX protein, wherein 
determining the ability of the test compound to interact with an NOVX protein comprises 
determining the ability of the test compound to preferentially bind to NOVX or 
biologically-active portion thereof as compared to the known compound. 

In still another embodiment, an assay is a cell-free assay comprising contacting NOVX
protein or biologically-active portion thereof with a test compound and determining the ability of
the test compound to modulate (e.g., stimulate or inhibit) the activity of the NOVX protein or
biologically-active portion thereof. Determining the ability of the test compound to modulate the
activity of NOVX can be accomplished, for example, by determining the ability of the NOVX
protein to bind to an NOVX target molecule by one of the methods described above for
determining direct binding. In an alternative embodiment, determining the ability of the test
compound to modulate the activity of NOVX protein can be accomplished by determining the
ability of the NOVX protein to further modulate an NOVX target molecule. For example, the
catalytic/enzymatic activity of the target molecule on an appropriate substrate can be determined
as described, supra.

In yet another embodiment, the cell-free assay comprises contacting the NOVX protein or 
biologically-active portion thereof with a known compound which binds NOVX protein to form 
an assay mixture, contacting the assay mixture with a test compound, and determining the ability 
of the test compound to interact with an NOVX protein, wherein determining the ability of the test 
compound to interact with an NOVX protein comprises determining the ability of the NOVX 
protein to preferentially bind to or modulate the activity of an NOVX target molecule. 

The cell-free assays of the invention are amenable to use of either the soluble form or the
membrane-bound form of NOVX protein. In the case of cell-free assays comprising the
membrane-bound form of NOVX protein, it may be desirable to utilize a solubilizing agent such
that the membrane-bound form of NOVX protein is maintained in solution. Examples of such
solubilizing agents include non-ionic detergents such as n-octylglucoside, n-dodecylglucoside,
n-dodecylmaltoside, octanoyl-N-methylglucamide, decanoyl-N-methylglucamide, Triton® X-100,
Triton® X-114, Thesit®, Isotridecypoly(ethylene glycol ether)n,
N-dodecyl-N,N-dimethyl-3-ammonio-1-propane sulfonate,
3-(3-cholamidopropyl)dimethylamminiol-1-propane sulfonate (CHAPS), or
3-(3-cholamidopropyl)dimethylamminiol-2-hydroxy-1-propane sulfonate (CHAPSO).

In more than one embodiment of the above assay methods of the invention, it may be 
desirable to immobilize either NOVX protein or its target molecule to facilitate separation of 
complexed from uncomplexed forms of one or both of the proteins, as well as to accommodate 
automation of the assay. Binding of a test compound to NOVX protein, or interaction of NOVX 
protein with a target molecule in the presence and absence of a candidate compound, can be 
accomplished in any vessel suitable for containing the reactants. Examples of such vessels 
include microtiter plates, test tubes, and micro-centrifuge tubes. In one embodiment, a fusion 
protein can be provided that adds a domain that allows one or both of the proteins to be bound to a 
matrix. For example, GST-NOVX fusion proteins or GST-target fusion proteins can be adsorbed 
onto glutathione sepharose beads (Sigma Chemical, St. Louis, MO) or glutathione derivatized 
microtiter plates, that are then combined with the test compound or the test compound and either 
the non-adsorbed target protein or NOVX protein, and the mixture is incubated under conditions
conducive to complex formation (e.g., at physiological conditions for salt and pH). Following
incubation, the beads or microtiter plate wells are washed to remove any unbound components,
the matrix is immobilized in the case of beads, and complex formation is determined either
directly or indirectly, for example, as described, supra. Alternatively, the complexes can be
dissociated from the matrix,
and the level of NOVX protein binding or activity determined using standard techniques. 

Other techniques for immobilizing proteins on matrices can also be used in the screening 
assays of the invention. For example, either the NOVX protein or its target molecule can be 
immobilized utilizing conjugation of biotin and streptavidin. Biotinylated NOVX protein or 
target molecules can be prepared from biotin-NHS (N-hydroxy-succinimide) using techniques
well-known within the art (e.g., biotinylation kit, Pierce Chemicals, Rockford, Ill.), and
immobilized in the wells of streptavidin-coated 96 well plates (Pierce Chemical). Alternatively,
antibodies reactive with NOVX protein or target molecules, but which do not interfere with
binding of the NOVX protein to its target molecule, can be derivatized to the wells of the plate,
and unbound target or NOVX protein trapped in the wells by antibody conjugation. Methods for
detecting such complexes, in addition to those described above for the GST-immobilized
complexes, include immunodetection of complexes using antibodies reactive with the NOVX
protein or target molecule, as well as enzyme-linked assays that rely on detecting an enzymatic
activity associated with the NOVX protein or target molecule.

In another embodiment, modulators of NOVX protein expression are identified in a
method wherein a cell is contacted with a candidate compound and the expression of NOVX
mRNA or protein in the cell is determined. The level of expression of NOVX mRNA or protein
in the presence of the candidate compound is compared to the level of expression of NOVX
mRNA or protein in the absence of the candidate compound. The candidate compound can then
be identified as a modulator of NOVX mRNA or protein expression based upon this comparison.
For example, when expression of NOVX mRNA or protein is greater (i.e., statistically
significantly greater) in the presence of the candidate compound than in its absence, the candidate
compound is identified as a stimulator of NOVX mRNA or protein expression. Alternatively,
when expression of NOVX mRNA or protein is less (i.e., statistically significantly less) in the
presence of the candidate compound than in its absence, the candidate compound is identified as
an inhibitor of NOVX mRNA or protein expression. The level of NOVX mRNA or protein
expression in the cells can be determined by methods described herein for detecting NOVX
mRNA or protein.
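
By way of illustration only, the comparison of expression levels in the presence and absence of a candidate compound can be sketched as follows. The specification does not name a statistical test; the exact permutation test, the 0.05 threshold, the function names and the expression values below are all assumptions chosen for the example.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, untreated):
    """Exact two-sided permutation p-value for a difference in group means."""
    pooled = list(treated) + list(untreated)
    n_treated = len(treated)
    n_untreated = len(pooled) - n_treated
    observed = abs(mean(treated) - mean(untreated))
    total = sum(pooled)
    hits = 0
    count = 0
    # Enumerate every way of relabeling the pooled measurements as "treated".
    for idx in combinations(range(len(pooled)), n_treated):
        s = sum(pooled[i] for i in idx)
        diff = abs(s / n_treated - (total - s) / n_untreated)
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            hits += 1
        count += 1
    return hits / count


def classify_compound(treated, untreated, alpha=0.05):
    """Label a candidate compound from expression levels with and without it."""
    if permutation_p_value(treated, untreated) >= alpha:
        return "no significant effect"
    return "stimulator" if mean(treated) > mean(untreated) else "inhibitor"


# Hypothetical relative NOVX mRNA levels (arbitrary units), five replicates each.
with_compound = [9.1, 8.7, 9.4, 9.0, 8.9]
without_compound = [4.2, 4.8, 4.5, 4.1, 4.6]
label = classify_compound(with_compound, without_compound)
```

The exact test enumerates every relabeling, so it is practical only for small replicate counts; any other accepted test of significance could be substituted.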

In yet another aspect of the invention, the NOVX proteins can be used as "bait proteins" in
a two-hybrid assay or three hybrid assay (see, e.g., U.S. Patent No. 5,283,317; Zervos, et al.,
1993. Cell 72: 223-232; Madura, et al., 1993. J. Biol. Chem. 268: 12046-12054; Bartel, et al.,
1993. Biotechniques 14: 920-924; Iwabuchi, et al., 1993. Oncogene 8: 1693-1696; and Brent WO
94/10300), to identify other proteins that bind to or interact with NOVX ("NOVX-binding
proteins" or "NOVX-bp") and modulate NOVX activity. Such NOVX-binding proteins are also
likely to be involved in the propagation of signals by the NOVX proteins as, for example,
upstream or downstream elements of the NOVX pathway.

The two-hybrid system is based on the modular nature of most transcription factors, which
consist of separable DNA-binding and activation domains. Briefly, the assay utilizes two
different DNA constructs. In one construct, the gene that codes for NOVX is fused to a gene
encoding the DNA binding domain of a known transcription factor (e.g., GAL-4). In the other
construct, a DNA sequence, from a library of DNA sequences, that encodes an unidentified
protein ("prey" or "sample") is fused to a gene that codes for the activation domain of the known
transcription factor. If the "bait" and the "prey" proteins are able to interact, in vivo, forming an
NOVX-dependent complex, the DNA-binding and activation domains of the transcription factor
are brought into close proximity. This proximity allows transcription of a reporter gene (e.g.,
LacZ) that is operably linked to a transcriptional regulatory site responsive to the transcription
factor. Expression of the reporter gene can be detected and cell colonies containing the functional
transcription factor can be isolated and used to obtain the cloned gene that encodes the protein
which interacts with NOVX.

The invention further pertains to novel agents identified by the aforementioned screening 
assays and uses thereof for treatments as described herein.

Detection Assays 

Portions or fragments of the cDNA sequences identified herein (and the corresponding 
complete gene sequences) can be used in numerous ways as polynucleotide reagents. By way of 
example, and not of limitation, these sequences can be used to: (i) map their respective genes on a
chromosome and, thus, locate gene regions associated with genetic disease; (ii) identify an
individual from a minute biological sample (tissue typing); and (iii) aid in forensic identification
of a biological sample. Some of these applications are described in the subsections, below. 

Chromosome Mapping 

Once the sequence (or a portion of the sequence) of a gene has been isolated, this sequence 
can be used to map the location of the gene on a chromosome. This process is called chromosome 
mapping. Accordingly, portions or fragments of the NOVX sequences, SEQ ID NOS:1, 3, 5, 7, 9,
11, 13, 15, 17, 19, 21, 23, 25, 27 and 29, or fragments or derivatives thereof, can be used to map
the location of the NOVX genes, respectively, on a chromosome. The mapping of the NOVX 
sequences to chromosomes is an important first step in correlating these sequences with genes 
associated with disease. 

Briefly, NOVX genes can be mapped to chromosomes by preparing PCR primers 
(preferably 15-25 bp in length) from the NOVX sequences. Computer analysis of the NOVX
sequences can be used to rapidly select primers that do not span more than one exon in the
genomic DNA, since primers spanning an exon boundary would complicate the amplification
process. These primers can then be used for
PCR screening of somatic cell hybrids containing individual human chromosomes. Only those 
hybrids containing the human gene corresponding to the NOVX sequences will yield an amplified 
fragment. 
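
The computer-assisted primer selection described above can be sketched, purely as an illustration, as a scan for windows of 15-25 bases that lie wholly within a single exon. The function name, the representation of exon boundaries as 0-based cDNA offsets, and the toy sequence are assumptions made for the example; a practical tool would also screen melting temperature, self-complementarity and uniqueness, none of which is modeled here.

```python
def candidate_primers(cdna, exon_starts, min_len=15, max_len=25):
    """Return (start, primer) pairs for windows that stay inside one exon.

    cdna        -- cDNA sequence as a string
    exon_starts -- sorted 0-based offsets in the cDNA at which each exon
                   begins (the first entry is 0); hypothetical input format
    """
    bounds = list(exon_starts) + [len(cdna)]
    primers = []
    for exon_start, exon_end in zip(bounds, bounds[1:]):
        for length in range(min_len, max_len + 1):
            # The last valid start keeps the whole window before exon_end.
            for start in range(exon_start, exon_end - length + 1):
                primers.append((start, cdna[start:start + length]))
    return primers


# Toy cDNA: a 20-base exon followed by a 30-base exon (made-up sequence).
toy_cdna = "A" * 20 + "C" * 30
fifteen_mers = candidate_primers(toy_cdna, [0, 20], min_len=15, max_len=15)
```

In the toy input each exon is a run of a single base, so any window crossing the boundary would contain both bases, which makes it easy to confirm that none is returned.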

Somatic cell hybrids are prepared by fusing somatic cells from different mammals (e.g., 
human and mouse cells). As hybrids of human and mouse cells grow and divide, they gradually 
lose human chromosomes in random order, but retain the mouse chromosomes. By using media 
in which mouse cells cannot grow, because they lack a particular enzyme, but in which human 
cells can, the one human chromosome that contains the gene encoding the needed enzyme will be 
retained. By using various media, panels of hybrid cell lines can be established. Each cell line in 
a panel contains either a single human chromosome or a small number of human chromosomes, 
and a full set of mouse chromosomes, allowing easy mapping of individual genes to specific 
human chromosomes. See, e.g., D'Eustachio, et al., 1983. Science 220: 919-924. Somatic cell
hybrids containing only fragments of human chromosomes can also be produced by using human 
chromosomes with translocations and deletions. 

PCR mapping of somatic cell hybrids is a rapid procedure for assigning a particular 
sequence to a particular chromosome. Three or more sequences can be assigned per day using a 
single thermal cycler. Using the NOVX sequences to design oligonucleotide primers,
sublocalization can be achieved with panels of fragments from specific chromosomes.
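
The assignment of a sequence to a chromosome from a hybrid panel amounts to finding the human chromosome whose presence across the panel matches the pattern of positive and negative PCR results. A minimal sketch follows; the panel composition, hybrid names and PCR results are hypothetical and are not data from this specification.

```python
def assign_chromosome(pcr_results, panel):
    """Return the chromosomes fully concordant with the PCR results.

    pcr_results -- {hybrid name: True if an amplified fragment was obtained}
    panel       -- {hybrid name: set of human chromosomes retained}
    """
    chromosomes = set().union(*panel.values())
    concordant = []
    for chrom in sorted(chromosomes):
        # Concordant means present in every positive hybrid and absent
        # from every negative hybrid.
        if all((chrom in panel[h]) == pcr_results[h] for h in panel):
            concordant.append(chrom)
    return concordant


# Hypothetical four-hybrid panel and PCR outcome.
panel = {
    "H1": {"1", "7"},
    "H2": {"7", "X"},
    "H3": {"1", "X"},
    "H4": {"2"},
}
pcr = {"H1": True, "H2": True, "H3": False, "H4": False}
assignment = assign_chromosome(pcr, panel)  # ["7"]
```

An empty result would indicate that no single chromosome explains the pattern, and more than one entry would indicate that the panel cannot discriminate between the candidates.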

Fluorescence in situ hybridization (FISH) of a DNA sequence to a metaphase 
chromosomal spread can further be used to provide a precise chromosomal location in one step. 
Chromosome spreads can be made using cells whose division has been blocked in metaphase by a 
chemical like colcemid that disrupts the mitotic spindle. The chromosomes can be treated briefly 
with trypsin, and then stained with Giemsa. A pattern of light and dark bands develops on each 
chromosome, so that the chromosomes can be identified individually. The FISH technique can be 
used with a DNA sequence as short as 500 or 600 bases. However, clones larger than 1,000 bases 
have a higher likelihood of binding to a unique chromosomal location with sufficient signal 
intensity for simple detection. Preferably 1,000 bases, and more preferably 2,000 bases, will
suffice to get good results in a reasonable amount of time. For a review of this technique, see
Verma, et al., Human Chromosomes: A Manual of Basic Techniques (Pergamon Press, New
York 1988). 

Reagents for chromosome mapping can be used individually to mark a single chromosome 
or a single site on that chromosome, or panels of reagents can be used for marking multiple sites 
and/or multiple chromosomes. Reagents corresponding to noncoding regions of the genes 
actually are preferred for mapping purposes. Coding sequences are more likely to be conserved 
within gene families, thus increasing the chance of cross hybridizations during chromosomal 
mapping. 

Once a sequence has been mapped to a precise chromosomal location, the physical 
position of the sequence on the chromosome can be correlated with genetic map data. Such data
are found, e.g., in McKusick, Mendelian Inheritance in Man (available on-line through Johns
Hopkins University Welch Medical Library). The relationship between genes and disease,
mapped to the same chromosomal region, can then be identified through linkage analysis
(co-inheritance of physically adjacent genes), described in, e.g., Egeland, et al., 1987. Nature
325: 783-787.

Moreover, differences in the DNA sequences between individuals affected and unaffected 
with a disease associated with the NOVX gene, can be determined. If a mutation is observed in 
some or all of the affected individuals but not in any unaffected individuals, then the mutation is 
likely to be the causative agent of the particular disease. Comparison of affected and unaffected 
individuals generally involves first looking for structural alterations in the chromosomes, such as
deletions or translocations that are visible from chromosome spreads or detectable using PCR
based on that DNA sequence. Ultimately, complete sequencing of genes from several individuals
can be performed to confirm the presence of a mutation and to distinguish mutations from
polymorphisms.
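
The criterion stated above (a sequence difference seen in some or all affected individuals but in no unaffected individual) can be expressed as a simple set operation. The sketch below is illustrative only; the representation of a variant as a (position, base) pair and the example individuals are assumptions.

```python
def candidate_causative_variants(affected, unaffected):
    """Variants seen in at least one affected and in no unaffected individual.

    affected, unaffected -- lists of sets, one set of variants per individual;
    each variant is a hypothetical (position, base) pair.
    """
    seen_in_affected = set().union(*affected) if affected else set()
    seen_in_unaffected = set().union(*unaffected) if unaffected else set()
    return seen_in_affected - seen_in_unaffected


# Made-up example: (105, "T") is shared with an unaffected individual and so
# behaves like a polymorphism; (412, "A") is confined to affected individuals.
affected_individuals = [{(105, "T"), (412, "A")}, {(412, "A")}]
unaffected_individuals = [{(105, "T")}, set()]
candidates = candidate_causative_variants(affected_individuals, unaffected_individuals)
```

Variants removed by the subtraction are the ones most readily explained as polymorphisms rather than disease-associated mutations.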

Tissue Typing 

The NOVX sequences of the invention can also be used to identify individuals from 
minute biological samples. In this technique, an individual's genomic DNA is digested with one
or more restriction enzymes, and probed on a Southern blot to yield unique bands for
identification. The sequences of the invention are useful as additional DNA markers for RFLP
("restriction fragment length polymorphisms," described in U.S. Patent No. 5,272,057).

Furthermore, the sequences of the invention can be used to provide an alternative
technique that determines the actual base-by-base DNA sequence of selected portions of an
individual's genome. Thus, the NOVX sequences described herein can be used to prepare two
PCR primers from the 5'- and 3'-termini of the sequences. These primers can then be used to
amplify an individual's DNA and subsequently sequence it.

Panels of corresponding DNA sequences from individuals, prepared in this manner, can
provide unique individual identifications, as each individual will have a unique set of such DNA
sequences due to allelic differences. The sequences of the invention can be used to obtain such
identification sequences from individuals and from tissue. The NOVX sequences of the invention
uniquely represent portions of the human genome. Allelic variation occurs to some degree in the
coding regions of these sequences, and to a greater degree in the noncoding regions. It is
estimated that allelic variation between individual humans occurs with a frequency of about once
per each 500 bases. Much of the allelic variation is due to single nucleotide polymorphisms
(SNPs), which include restriction fragment length polymorphisms (RFLPs).

Each of the sequences described herein can, to some degree, be used as a standard against 
which DNA from an individual can be compared for identification purposes. Because greater 
numbers of polymorphisms occur in the noncoding regions, fewer sequences are necessary to 
differentiate individuals. The noncoding sequences can comfortably provide positive individual
identification with a panel of perhaps 10 to 1,000 primers that each yield a noncoding amplified
sequence of 100 bases. If predicted coding sequences, such as those in SEQ ID NOS:1, 3, 5, 7, 9,
11, 13, 15, 17, 19, 21, 23, 25, 27 and 29 are used, a more appropriate number of primers for
positive individual identification would be 500-2,000. 

Predictive Medicine 

The invention also pertains to the field of predictive medicine in which diagnostic assays, 
prognostic assays, pharmacogenomics, and monitoring clinical trials are used for prognostic 
(predictive) purposes to thereby treat an individual prophylactically. Accordingly, one aspect of 
the invention relates to diagnostic assays for determining NOVX protein and/or nucleic acid 
expression as well as NOVX activity, in the context of a biological sample (e.g., blood, serum, 
cells, tissue) to thereby determine whether an individual is afflicted with a disease or disorder, or 
is at risk of developing a disorder, associated with aberrant NOVX expression or activity. The 
disorders include metabolic disorders, diabetes, obesity, infectious disease, anorexia,
cancer-associated cachexia, cancer, neurodegenerative disorders, Alzheimer's Disease, Parkinson's
Disorder, immune disorders, and hematopoietic disorders, and the various dyslipidemias, 
metabolic disturbances associated with obesity, the metabolic syndrome X and wasting disorders 
associated with chronic diseases and various cancers. The invention also provides for prognostic 
(or predictive) assays for determining whether an individual is at risk of developing a disorder 
associated with NOVX protein, nucleic acid expression or activity. For example, mutations in an 
NOVX gene can be assayed in a biological sample. Such assays can be used for prognostic or
predictive purposes to thereby prophylactically treat an individual prior to the onset of a disorder
characterized by or associated with NOVX protein, nucleic acid expression, or biological activity. 

Another aspect of the invention provides methods for determining NOVX protein, nucleic 
acid expression or activity in an individual to thereby select appropriate therapeutic or 
prophylactic agents for that individual (referred to herein as "pharmacogenomics").
Pharmacogenomics allows for the selection of agents (e.g., drugs) for therapeutic or prophylactic
treatment of an individual based on the genotype of the individual (e.g., the genotype of the
individual examined to determine the ability of the individual to respond to a particular agent).

Yet another aspect of the invention pertains to monitoring the influence of agents (e.g., 
drugs, compounds) on the expression or activity of NOVX in clinical trials. 

These and other agents are described in further detail in the following sections.

Diagnostic Assays

An exemplary method for detecting the presence or absence of NOVX in a biological 
sample involves obtaining a biological sample from a test subject and contacting the biological 
sample with a compound or an agent capable of detecting NOVX protein or nucleic acid (e.g., 
mRNA, genomic DNA) that encodes NOVX protein such that the presence of NOVX is detected
in the biological sample. An agent for detecting NOVX mRNA or genomic DNA is a labeled
nucleic acid probe capable of hybridizing to NOVX mRNA or genomic DNA. The nucleic acid
probe can be, for example, a full-length NOVX nucleic acid, such as the nucleic acid of SEQ ID
NOS:1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27 and 29, or a portion thereof, such as an
oligonucleotide of at least 15, 30, 50, 100, 250 or 500 nucleotides in length and sufficient to
specifically hybridize under stringent conditions to NOVX mRNA or genomic DNA. Other
suitable probes for use in the diagnostic assays of the invention are described herein.

An agent for detecting NOVX protein is an antibody capable of binding to NOVX protein,
preferably an antibody with a detectable label. Antibodies can be polyclonal, or more preferably,
monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or F(ab')2) can be used. The
term "labeled", with regard to the probe or antibody, is intended to encompass direct labeling of
the probe or antibody by coupling (i.e., physically linking) a detectable substance to the probe or
antibody, as well as indirect labeling of the probe or antibody by reactivity with another reagent
that is directly labeled. Examples of indirect labeling include detection of a primary antibody
using a fluorescently-labeled secondary antibody and end-labeling of a DNA probe with biotin
such that it can be detected with fluorescently-labeled streptavidin. The term "biological sample"
is intended to include tissues, cells and biological fluids isolated from a subject, as well as tissues,
cells and fluids present within a subject. That is, the detection method of the invention can be
used to detect NOVX mRNA, protein, or genomic DNA in a biological sample in vitro as well as
in vivo. For example, in vitro techniques for detection of NOVX mRNA include Northern
hybridizations and in situ hybridizations. In vitro techniques for detection of NOVX protein
include enzyme linked immunosorbent assays (ELISAs), Western blots, immunoprecipitations,
and immunofluorescence. In vitro techniques for detection of NOVX genomic DNA include
Southern hybridizations. Furthermore, in vivo techniques for detection of NOVX protein include
introducing into a subject a labeled anti-NOVX antibody. For example, the antibody can be
labeled with a radioactive marker whose presence and location in a subject can be detected by
standard imaging techniques.

In one embodiment, the biological sample contains protein molecules from the test 
subject. Alternatively, the biological sample can contain mRNA molecules from the test subject or 
genomic DNA molecules from the test subject. A preferred biological sample is a peripheral 
blood leukocyte sample isolated by conventional means from a subject. 

In another embodiment, the methods further involve obtaining a control biological sample 
from a control subject, contacting the control sample with a compound or agent capable of 
detecting NOVX protein, mRNA, or genomic DNA, such that the presence of NOVX protein, 
mRNA or genomic DNA is detected in the biological sample, and comparing the presence of 
NOVX protein, mRNA or genomic DNA in the control sample with the presence of NOVX 
protein, mRNA or genomic DNA in the test sample. 

The invention also encompasses kits for detecting the presence of NOVX in a biological 
sample. For example, the kit can comprise: a labeled compound or agent capable of detecting 
NOVX protein or mRNA in a biological sample; means for determining the amount of NOVX in 
the sample; and means for comparing the amount of NOVX in the sample with a standard. The 
compound or agent can be packaged in a suitable container. The kit can further comprise 
instructions for using the kit to detect NOVX protein or nucleic acid. 

Prognostic Assays 

The diagnostic methods described herein can furthermore be utilized to identify subjects 
having or at risk of developing a disease or disorder associated with aberrant NOVX expression or 
activity. For example, the assays described herein, such as the preceding diagnostic assays or the 
following assays, can be utilized to identify a subject having or at risk of developing a disorder 
associated with NOVX protein, nucleic acid expression or activity. Alternatively, the prognostic 
assays can be utilized to identify a subject having or at risk for developing a disease or disorder. 
Thus, the invention provides a method for identifying a disease or disorder associated with 
aberrant NOVX expression or activity in which a test sample is obtained from a subject and 
NOVX protein or nucleic acid (e.g., mRNA, genomic DNA) is detected, wherein the presence of 
NOVX protein or nucleic acid is diagnostic for a subject having or at risk of developing a disease 
or disorder associated with aberrant NOVX expression or activity. As used herein, a "test 
sample" refers to a biological sample obtained from a subject of interest. For example, a test
sample can be a biological fluid (e.g., serum), cell sample, or tissue. 

Furthermore, the prognostic assays described herein can be used to determine whether a 
subject can be administered an agent (e.g., an agonist, antagonist, peptidomimetic, protein, 
peptide, nucleic acid, small molecule, or other drug candidate) to treat a disease or disorder 
associated with aberrant NOVX expression or activity. For example, such methods can be used to 
determine whether a subject can be effectively treated with an agent for a disorder. Thus, the 
invention provides methods for determining whether a subject can be effectively treated with an 
agent for a disorder associated with aberrant NOVX expression or activity in which a test sample 
is obtained and NOVX protein or nucleic acid is detected (e.g., wherein the presence of NOVX 
protein or nucleic acid is diagnostic for a subject that can be administered the agent to treat a 
disorder associated with aberrant NOVX expression or activity). 

The methods of the invention can also be used to detect genetic lesions in an NOVX gene, 
thereby determining if a subject with the lesioned gene is at risk for a disorder characterized by 
aberrant cell proliferation and/or differentiation. In various embodiments, the methods include 
detecting, in a sample of cells from the subject, the presence or absence of a genetic lesion 
characterized by at least one of an alteration affecting the integrity of a gene encoding an 
NOVX-protein, or the misexpression of the NOVX gene. For example, such genetic lesions can 
be detected by ascertaining the existence of at least one of: (i) a deletion of one or more
nucleotides from an NOVX gene; (ii) an addition of one or more nucleotides to an NOVX gene;
(iii) a substitution of one or more nucleotides of an NOVX gene; (iv) a chromosomal
rearrangement of an NOVX gene; (v) an alteration in the level of a messenger RNA transcript of
an NOVX gene; (vi) aberrant modification of an NOVX gene, such as of the methylation pattern
of the genomic DNA; (vii) the presence of a non-wild-type splicing pattern of a messenger RNA
transcript of an NOVX gene; (viii) a non-wild-type level of an NOVX protein; (ix) allelic loss of
an NOVX gene; and (x) inappropriate post-translational modification of an NOVX protein. As
described herein, there are a large number of assay techniques known in the art which can be used 
for detecting lesions in an NOVX gene. A preferred biological sample is a peripheral blood 
leukocyte sample isolated by conventional means from a subject. However, any biological 
sample containing nucleated cells may be used, including, for example, buccal mucosal cells.

In certain embodiments, detection of the lesion involves the use of a probe/primer in a
polymerase chain reaction (PCR) (see, e.g., U.S. Patent Nos. 4,683,195 and 4,683,202), such as
anchor PCR or RACE PCR, or, alternatively, in a ligation chain reaction (LCR) (see, e.g.,
Landegren, et al., 1988. Science 241: 1077-1080; and Nakazawa, et al., 1994. Proc. Natl. Acad.
Sci. USA 91: 360-364), the latter of which can be particularly useful for detecting point mutations
in the NOVX-gene (see, Abravaya, et al., 1995. Nucl. Acids Res. 23: 675-682). This method can
include the steps of collecting a sample of cells from a patient, isolating nucleic acid (e.g., 
genomic, mRNA or both) from the cells of the sample, contacting the nucleic acid sample with 
one or more primers that specifically hybridize to an NOVX gene under conditions such that 
hybridization and amplification of the NOVX gene (if present) occurs, and detecting the presence 
or absence of an amplification product, or detecting the size of the amplification product and 
comparing the length to a control sample. It is anticipated that PCR and/or LCR may be desirable 
to use as a preliminary amplification step in conjunction with any of the techniques used for 
detecting mutations described herein. 

Alternative amplification methods include: self sustained sequence replication (see,
Guatelli, et al., 1990. Proc. Natl. Acad. Sci. USA 87: 1874-1878), transcriptional amplification
system (see, Kwoh, et al., 1989. Proc. Natl. Acad. Sci. USA 86: 1173-1177), Qβ Replicase (see,
Lizardi, et al., 1988. BioTechnology 6: 1197), or any other nucleic acid amplification method,
followed by the detection of the amplified molecules using techniques well known to those of 
skill in the art. These detection schemes are especially useful for the detection of nucleic acid 
molecules if such molecules are present in very low numbers. 

In an alternative embodiment, mutations in an NOVX gene from a sample cell can be 
identified by alterations in restriction enzyme cleavage patterns. For example, sample and control 
DNA is isolated, amplified (optionally), digested with one or more restriction endonucleases, and 
fragment length sizes are determined by gel electrophoresis and compared. Differences in 
fragment length sizes between sample and control DNA indicate mutations in the sample DNA.
Moreover, sequence specific ribozymes (see, e.g., U.S. Patent No. 5,493,531) can be
used to score for the presence of specific mutations by development or loss of a ribozyme
cleavage site. 
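
The comparison of restriction fragment lengths between sample and control DNA can be illustrated with a short in silico digest. This is a sketch under stated assumptions: the sequences are invented, the molecule is treated as linear and single-stranded, and the only enzyme modeled is EcoRI (recognition site GAATTC, cutting after the first G), expressed through the hypothetical `cut_offset` parameter.

```python
def fragment_lengths(dna, site, cut_offset=0):
    """Sorted fragment lengths after cutting a linear molecule at every site.

    cut_offset is the number of bases from the start of the recognition
    site to the cut position on the strand given.
    """
    cuts = []
    pos = dna.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = dna.find(site, pos + 1)
    bounds = [0] + cuts + [len(dna)]
    return sorted(b - a for a, b in zip(bounds, bounds[1:]) if b > a)


# Invented control sequence with two EcoRI sites; the sample carries a
# single-base change (A to G) that destroys the second site.
control_dna = "AAAA" + "GAATTC" + "CCCCCC" + "GAATTC" + "TT"
sample_dna = "AAAA" + "GAATTC" + "CCCCCC" + "GAGTTC" + "TT"

control_fragments = fragment_lengths(control_dna, "GAATTC", cut_offset=1)  # [5, 7, 12]
sample_fragments = fragment_lengths(sample_dna, "GAATTC", cut_offset=1)    # [5, 19]
mutation_indicated = control_fragments != sample_fragments                 # True
```

Loss of a site merges two control fragments into one longer sample fragment, which is the kind of difference that gel electrophoresis would reveal.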

In other embodiments, genetic mutations in NOVX can be identified by hybridizing sample 
and control nucleic acids, e.g., DNA or RNA, to high-density arrays containing hundreds 
or thousands of oligonucleotide probes. See, e.g., Cronin, et al., 1996. Human Mutation 7: 
244-255; Kozal, et al., 1996. Nat. Med. 2: 753-759. For example, genetic mutations in NOVX 
can be identified in two dimensional arrays containing light-generated DNA probes as described 
in Cronin, et al., supra. Briefly, a first hybridization array of probes can be used to scan through 
long stretches of DNA in a sample and control to identify base changes between the sequences by 
making linear arrays of sequential overlapping probes. This step allows the identification of point 
mutations. This is followed by a second hybridization array that allows the characterization of 
specific mutations by using smaller, specialized probe arrays complementary to all variants or 
mutations detected. Each mutation array is composed of parallel probe sets, one complementary 
to the wild-type gene and the other complementary to the mutant gene. 

In yet another embodiment, any of a variety of sequencing reactions known in the art can 
be used to directly sequence the NOVX gene and detect mutations by comparing the sequence of 
the sample NOVX with the corresponding wild-type (control) sequence. Examples of sequencing 
reactions include those based on techniques developed by Maxam and Gilbert, 1977. Proc. Natl. 
Acad. Sci. USA 74: 560 or Sanger, 1977. Proc. Natl. Acad. Sci. USA 74: 5463. It is also 
contemplated that any of a variety of automated sequencing procedures can be utilized when 
performing the diagnostic assays (see, e.g., Naeve, et al., 1995. Biotechniques 19: 448), including 
sequencing by mass spectrometry (see, e.g., PCT International Publication No. WO 94/16101; 
Cohen, et al., 1996. Adv. Chromatography 36: 127-162; and Griffin, et al., 1993. Appl. Biochem. 
Biotechnol. 38: 147-159). 
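
As a hedged illustration of comparing a sample sequence with the wild-type (control) sequence, the sketch below reports positions at which two sequences differ. It assumes the two sequences are already aligned and of equal length, so it only finds substitutions; the function name `point_differences` and the example sequences are hypothetical.

```python
from typing import List, Tuple


def point_differences(wild_type: str, sample: str) -> List[Tuple[int, str, str]]:
    """List (1-based position, wild-type base, sample base) for each mismatch.

    Assumes both sequences are pre-aligned and of equal length; insertions
    and deletions would require a real alignment step first.
    """
    if len(wild_type) != len(sample):
        raise ValueError("sequences must be aligned to equal length")
    pairs = zip(wild_type.upper(), sample.upper())
    return [(i + 1, w, s) for i, (w, s) in enumerate(pairs) if w != s]


# Hypothetical six-base example with one substitution at position 4.
print(point_differences("ATGGCC", "ATGACC"))
```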

Other methods for detecting mutations in the NOVX gene include methods in which 
protection from cleavage agents is used to detect mismatched bases in RNA/RNA or RNA/DNA 
heteroduplexes. See, e.g., Myers, et al., 1985. Science 230: 1242. In general, the art technique of 
"mismatch cleavage" starts by providing heteroduplexes formed by hybridizing (labeled) RNA 
or DNA containing the wild-type NOVX sequence with potentially mutant RNA or DNA 
obtained from a tissue sample. The double-stranded duplexes are treated with an agent that 
cleaves single-stranded regions of the duplex, such as those that exist due to basepair mismatches 
between the control and sample strands. For instance, RNA/DNA duplexes can be treated with 
RNase and DNA/DNA hybrids treated with S1 nuclease to enzymatically digest the 
mismatched regions. In other embodiments, either DNA/DNA or RNA/DNA duplexes can be 
treated with hydroxylamine or osmium tetroxide and with piperidine in order to digest 
mismatched regions. After digestion of the mismatched regions, the resulting material is then 
separated by size on denaturing polyacrylamide gels to determine the site of mutation. See, e.g., 
Cotton, et al., 1988. Proc. Natl. Acad. Sci. USA 85: 4397; Saleeba, et al., 1992. Methods Enzymol. 
217: 286-295. In an embodiment, the control DNA or RNA can be labeled for detection. 

In still another embodiment, the mismatch cleavage reaction employs one or more proteins 
that recognize mismatched base pairs in double-stranded DNA (so called "DNA mismatch repair" 
enzymes) in defined systems for detecting and mapping point mutations in NOVX cDNAs 
obtained from samples of cells. For example, the mutY enzyme of E. coli cleaves A at G/A 
mismatches and the thymidine DNA glycosylase from HeLa cells cleaves T at G/T mismatches. 
See, e.g., Hsu, et al., 1994. Carcinogenesis 15: 1657-1662. According to an exemplary 
embodiment, a probe based on an NOVX sequence, e.g., a wild-type NOVX sequence, is 
hybridized to a cDNA or other DNA product from a test cell(s). The duplex is treated with a 
DNA mismatch repair enzyme, and the cleavage products, if any, can be detected from 
electrophoresis protocols or the like. See, e.g., U.S. Patent No. 5,459,039. 

In other embodiments, alterations in electrophoretic mobility will be used to identify 
mutations in NOVX genes. For example, single strand conformation polymorphism (SSCP) may 
be used to detect differences in electrophoretic mobility between mutant and wild type nucleic 
acids. See, e.g., Orita, et al., 1989. Proc. Natl. Acad. Sci. USA 86: 2766; Cotton, 1993. Mutat. 
Res. 285: 125-144; Hayashi, 1992. Genet. Anal. Tech. Appl. 9: 73-79. Single-stranded DNA 
fragments of sample and control NOVX nucleic acids will be denatured and allowed to renature. 
The secondary structure of single-stranded nucleic acids varies according to sequence; the 
resulting alteration in electrophoretic mobility enables the detection of even a single base change. 
The DNA fragments may be labeled or detected with labeled probes. The sensitivity of the assay 
may be enhanced by using RNA (rather than DNA), in which the secondary structure is more 
sensitive to a change in sequence. In one embodiment, the subject method utilizes heteroduplex 
analysis to separate double stranded heteroduplex molecules on the basis of changes in 
electrophoretic mobility. See, e.g., Keen, et al., 1991. Trends Genet. 7: 5. 

In yet another embodiment, the movement of mutant or wild-type fragments in 
polyacrylamide gels containing a gradient of denaturant is assayed using denaturing gradient gel 
electrophoresis (DGGE). See, e.g., Myers, et al., 1985. Nature 313: 495. When DGGE is used as 
the method of analysis, DNA will be modified to ensure that it does not completely denature, for 
example by adding a GC clamp of approximately 40 bp of high-melting GC-rich DNA by PCR. 
In a further embodiment, a temperature gradient is used in place of a denaturing gradient to 
identify differences in the mobility of control and sample DNA. See, e.g., Rosenbaum and 
Reissner, 1987. Biophys. Chem. 265: 12753. 

Examples of other techniques for detecting point mutations include, but are not limited to, 
selective oligonucleotide hybridization, selective amplification, or selective primer extension. For 
example, oligonucleotide primers may be prepared in which the known mutation is placed 
centrally and then hybridized to target DNA under conditions that permit hybridization only if a 
perfect match is found. See, e.g., Saiki, et al., 1986. Nature 324: 163; Saiki, et al., 1989. Proc. 
Natl. Acad. Sci. USA 86: 6230. Such allele specific oligonucleotides are hybridized to PCR 
amplified target DNA or a number of different mutations when the oligonucleotides are attached 
to the hybridizing membrane and hybridized with labeled target DNA. 

Alternatively, allele specific amplification technology that depends on selective PCR 
amplification may be used in conjunction with the instant invention. Oligonucleotides used as 
primers for specific amplification may carry the mutation of interest in the center of the molecule 
(so that amplification depends on differential hybridization; see, e.g., Gibbs, et al., 1989. Nucl. 
Acids Res. 17: 2437-2448) or at the extreme 3'-terminus of one primer where, under appropriate 
conditions, mismatch can prevent or reduce polymerase extension (see, e.g., Prossner, 1993. 
Tibtech. 11: 238). In addition it may be desirable to introduce a novel restriction site in the region 
of the mutation to create cleavage-based detection. See, e.g., Gasparini, et al., 1992. Mol. Cell 
Probes 6: 1. It is anticipated that in certain embodiments amplification may also be performed 
using Taq ligase for amplification. See, e.g., Barany, 1991. Proc. Natl. Acad. Sci. USA 88: 189. 
In such cases, ligation will occur only if there is a perfect match at the 3'-terminus of the 5' 
sequence, making it possible to detect the presence of a known mutation at a specific site by 
looking for the presence or absence of amplification. 
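
The dependence of allele specific amplification on a match at the 3'-terminus of a primer can be sketched as follows. This is an idealized model, assuming that extension occurs only when the 3'-terminal base of the primer matches the target, and that the primer is written as the same-strand sequence it pairs against; the function name `extension_expected` and the sequences are hypothetical.

```python
def extension_expected(primer: str, target: str, start: int) -> bool:
    """Idealized check of whether a polymerase would extend the primer.

    The primer is written 5'->3' as the same-strand sequence of the target
    region beginning at index start. Only the 3'-terminal base is examined,
    which is a simplification of real primer behavior.
    """
    end = start + len(primer)
    if start < 0 or end > len(target):
        raise ValueError("primer does not fit within the target")
    return primer[-1].upper() == target[end - 1].upper()


# Hypothetical wild-type and mutant targets differing at one base.
wild_type_target = "ACGTTGCA"
mutant_target = "ACGTAGCA"
wild_type_primer = "ACGTT"  # 3'-terminal base matches the wild-type allele
mutant_primer = "ACGTA"     # 3'-terminal base matches the mutant allele

print(extension_expected(wild_type_primer, wild_type_target, 0))
print(extension_expected(mutant_primer, wild_type_target, 0))
print(extension_expected(mutant_primer, mutant_target, 0))
```

Running both primers against a sample and noting which one supports extension gives a simple presence-or-absence readout for the known mutation.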

The methods described herein may be performed, for example, by utilizing pre-packaged 
diagnostic kits comprising at least one probe nucleic acid or antibody reagent described herein, 
which may be conveniently used, e.g., in clinical settings to diagnose patients exhibiting 
symptoms or family history of a disease or illness involving an NOVX gene. 

Furthermore, any cell type or tissue, preferably peripheral blood leukocytes, in which 
NOVX is expressed may be utilized in the prognostic assays described herein. However, any 
biological sample containing nucleated cells may be used, including, for example, buccal mucosal 
cells. 

Pharmacogenomics 

Agents, or modulators, that have a stimulatory or inhibitory effect on NOVX activity (e.g., 
NOVX gene expression), as identified by a screening assay described herein, can be administered 
to individuals to treat (prophylactically or therapeutically) disorders. The disorders include 
metabolic disorders, diabetes, obesity, infectious disease, anorexia, cancer-associated cachexia, 
cancer, neurodegenerative disorders, Alzheimer's Disease, Parkinson's Disorder, immune 
disorders, and hematopoietic disorders, and the various dyslipidemias, metabolic disturbances 
associated with obesity, the metabolic syndrome X and wasting disorders associated with chronic 
diseases and various cancers. In conjunction with such treatment, the pharmacogenomics (i.e., 
the study of the relationship between an individual's genotype and that individual's response to a 
foreign compound or drug) of the individual may be considered. Differences in metabolism of 
therapeutics can lead to severe toxicity or therapeutic failure by altering the relation between dose 
and blood concentration of the pharmacologically active drug. Thus, the pharmacogenomics of 
the individual permits the selection of effective agents (e.g., drugs) for prophylactic or therapeutic 
treatments based on a consideration of the individual's genotype. Such pharmacogenomics can 
further be used to determine appropriate dosages and therapeutic regimens. Accordingly, the 
activity of NOVX protein, expression of NOVX nucleic acid, or mutation content of NOVX 
genes in an individual can be determined to thereby select appropriate agent(s) for therapeutic or 
prophylactic treatment of the individual. 

Pharmacogenomics deals with clinically significant hereditary variations in the response to 
drugs due to altered drug disposition and abnormal action in affected persons. See, e.g., 
Eichelbaum, 1996. Clin. Exp. Pharmacol. Physiol. 23: 983-985; Linder, 1997. Clin. Chem. 43: 
254-266. In general, two types of pharmacogenetic conditions can be differentiated: genetic 
conditions transmitted as a single factor altering the way drugs act on the body (altered drug 
action) and genetic conditions transmitted as single factors altering the way the body acts on drugs 
(altered drug metabolism). These pharmacogenetic conditions can occur either as rare defects or 
as polymorphisms. For example, glucose-6-phosphate dehydrogenase (G6PD) deficiency is a 
common inherited enzymopathy in which the main clinical complication is hemolysis after 
ingestion of oxidant drugs (anti-malarials, sulfonamides, analgesics, nitrofurans) and consumption 
of fava beans. 

As an illustrative embodiment, the activity of drug metabolizing enzymes is a major 
determinant of both the intensity and duration of drug action. The discovery of genetic 
polymorphisms of drug metabolizing enzymes (e.g., N-acetyltransferase 2 (NAT 2) and 
cytochrome P450 enzymes CYP2D6 and CYP2C19) has provided an explanation as to why some 
patients do not obtain the expected drug effects or show exaggerated drug response and serious 
toxicity after taking the standard and safe dose of a drug. These polymorphisms are expressed in 
two phenotypes in the population, the extensive metabolizer (EM) and poor metabolizer (PM). 
The prevalence of PM is different among different populations. For example, the gene coding for 
CYP2D6 is highly polymorphic and several mutations have been identified in PM, which all lead 
to the absence of functional CYP2D6. Poor metabolizers of CYP2D6 and CYP2C19 quite 
frequently experience exaggerated drug response and side effects when they receive standard 
doses. If a metabolite is the active therapeutic moiety, PM show no therapeutic response, as 
demonstrated for the analgesic effect of codeine mediated by its CYP2D6-formed metabolite 
morphine. At the other extreme are the so called ultra-rapid metabolizers who do not respond to 
standard doses. Recently, the molecular basis of ultra-rapid metabolism has been identified to be 
due to CYP2D6 gene amplification. 

Thus, the activity of NOVX protein, expression of NOVX nucleic acid, or mutation 
content of NOVX genes in an individual can be determined to thereby select appropriate agent(s) 
for therapeutic or prophylactic treatment of the individual. In addition, pharmacogenetic studies 
can be used to apply genotyping of polymorphic alleles encoding drug-metabolizing enzymes to 
the identification of an individual's drug responsiveness phenotype. This knowledge, when 
applied to dosing or drug selection, can avoid adverse reactions or therapeutic failure and thus 
enhance therapeutic or prophylactic efficiency when treating a subject with an NOVX modulator, 
such as a modulator identified by one of the exemplary screening assays described herein. 

Monitoring of Effects During Clinical Trials 

Monitoring the influence of agents (e.g., drugs, compounds) on the expression or activity 
of NOVX (e.g., the ability to modulate aberrant cell proliferation and/or differentiation) can be 
applied not only in basic drug screening, but also in clinical trials. For example, the effectiveness 
of an agent determined by a screening assay as described herein to increase NOVX gene 
expression, protein levels, or upregulate NOVX activity, can be monitored in clinical trials of 
subjects exhibiting decreased NOVX gene expression, protein levels, or downregulated NOVX 
activity. Alternatively, the effectiveness of an agent determined by a screening assay to decrease 
NOVX gene expression, protein levels, or downregulate NOVX activity, can be monitored in 
clinical trials of subjects exhibiting increased NOVX gene expression, protein levels, or 
upregulated NOVX activity. In such clinical trials, the expression or activity of NOVX and, 
preferably, other genes that have been implicated in, for example, a cellular proliferation or 
immune disorder can be used as a "read out" or markers of the immune responsiveness of a 
particular cell. 

By way of example, and not of limitation, genes, including NOVX, that are modulated in 
cells by treatment with an agent (e.g., compound, drug or small molecule) that modulates NOVX 
activity (e.g., identified in a screening assay as described herein) can be identified. Thus, to study 
the effect of agents on cellular proliferation disorders, for example, in a clinical trial, cells can be 
isolated and RNA prepared and analyzed for the levels of expression of NOVX and other genes 
implicated in the disorder. The levels of gene expression (i.e., a gene expression pattern) can be 
quantified by Northern blot analysis or RT-PCR, as described herein, or alternatively by 
measuring the amount of protein produced, by one of the methods as described herein, or by 
measuring the levels of activity of NOVX or other genes. In this manner, the gene expression 
pattern can serve as a marker, indicative of the physiological response of the cells to the agent. 
Accordingly, this response state may be determined before, and at various points during, 
treatment of the individual with the agent. 

In one embodiment, the invention provides a method for monitoring the effectiveness of 
treatment of a subject with an agent (e.g., an agonist, antagonist, protein, peptide, peptidomimetic, 
nucleic acid, small molecule, or other drug candidate identified by the screening assays described 
herein) comprising the steps of (i) obtaining a pre-administration sample from a subject prior to 
administration of the agent; (ii) detecting the level of expression of an NOVX protein, mRNA, or 
genomic DNA in the pre-administration sample; (iii) obtaining one or more post-administration 
samples from the subject; (iv) detecting the level of expression or activity of the NOVX protein, 
mRNA, or genomic DNA in the post-administration samples; (v) comparing the level of 
expression or activity of the NOVX protein, mRNA, or genomic DNA in the pre-administration 
sample with the NOVX protein, mRNA, or genomic DNA in the post-administration sample or 
samples; and (vi) altering the administration of the agent to the subject accordingly. For example, 
increased administration of the agent may be desirable to increase the expression or activity of 
NOVX to higher levels than detected, i.e., to increase the effectiveness of the agent. 
Alternatively, decreased administration of the agent may be desirable to decrease expression or 
activity of NOVX to lower levels than detected, i.e., to decrease the effectiveness of the agent. 
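
The comparison in step (v) above can be sketched in a few lines. This is only an illustration, assuming expression is summarized as a single positive number per sample and that a fixed tolerance separates a change from no change; the names `expression_change` and `agent_response`, the tolerance value, and the example measurements are hypothetical and do not come from the original disclosure.

```python
from statistics import mean
from typing import List


def expression_change(pre: float, post: List[float]) -> float:
    """Ratio of mean post-administration level to pre-administration level."""
    if pre <= 0 or not post:
        raise ValueError("need a positive baseline and at least one post sample")
    return mean(post) / pre


def agent_response(pre: float, post: List[float], goal: str,
                   tolerance: float = 0.1) -> str:
    """Report whether expression moved in the intended direction.

    goal is "increase" or "decrease"; tolerance is an assumed dead band
    around a ratio of 1.0 within which no change is called.
    """
    ratio = expression_change(pre, post)
    if goal == "increase":
        return "responding" if ratio > 1 + tolerance else "not responding"
    if goal == "decrease":
        return "responding" if ratio < 1 - tolerance else "not responding"
    raise ValueError("goal must be 'increase' or 'decrease'")


# Hypothetical relative expression values (arbitrary units).
print(expression_change(2.0, [3.0, 5.0]))
print(agent_response(2.0, [3.0, 5.0], "increase"))
```

How the administration is then altered, as in step (vi), is outside the scope of this sketch.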

Methods of Treatment 

The invention provides for both prophylactic and therapeutic methods of treating a subject 
at risk of (or susceptible to) a disorder or having a disorder associated with aberrant NOVX 
expression or activity. The disorders include cardiomyopathy, atherosclerosis, hypertension, 
congenital heart defects, aortic stenosis, atrial septal defect (ASD), atrioventricular (A-V) canal 
defect, ductus arteriosus, pulmonary stenosis, subaortic stenosis, ventricular septal defect (VSD), 
valve diseases, tuberous sclerosis, scleroderma, obesity, transplantation, adrenoleukodystrophy, 
congenital adrenal hyperplasia, prostate cancer, neoplasm, adenocarcinoma, lymphoma, uterus 
cancer, fertility, hemophilia, hypercoagulation, idiopathic thrombocytopenic purpura, 
immunodeficiencies, graft versus host disease, AIDS, bronchial asthma, Crohn's disease, multiple 
sclerosis, treatment of Albright Hereditary Osteodystrophy, and other diseases, disorders and 
conditions of the like. 

These methods of treatment will be discussed more fully, below. 

Disease and Disorders 

Diseases and disorders that are characterized by increased (relative to a subject not 
suffering from the disease or disorder) levels or biological activity may be treated with 
Therapeutics that antagonize (i.e., reduce or inhibit) activity. Therapeutics that antagonize 
activity may be administered in a therapeutic or prophylactic manner. Therapeutics that may be 
utilized include, but are not limited to: (i) an aforementioned peptide, or analogs, derivatives, 
fragments or homologs thereof; (ii) antibodies to an aforementioned peptide; (iii) nucleic acids 
encoding an aforementioned peptide; (iv) administration of antisense nucleic acid and nucleic 
acids that are "dysfunctional" (i.e., due to a heterologous insertion within the coding sequences 
of an aforementioned peptide) that are utilized to "knockout" endogenous 
function of an aforementioned peptide by homologous recombination (see, e.g., Capecchi, 1989. 
Science 244: 1288-1292); or (v) modulators (i.e., inhibitors, agonists and antagonists, including 
additional peptide mimetic of the invention or antibodies specific to a peptide of the invention) 
that alter the interaction between an aforementioned peptide and its binding partner. 

Diseases and disorders that are characterized by decreased (relative to a subject not 
suffering from the disease or disorder) levels or biological activity may be treated with 
Therapeutics that increase (i.e., are agonists to) activity. Therapeutics that upregulate activity 
may be administered in a therapeutic or prophylactic manner. Therapeutics that may be utilized 
include, but are not limited to, an aforementioned peptide, or analogs, derivatives, fragments or 
homologs thereof; or an agonist that increases bioavailability. 

Increased or decreased levels can be readily detected by quantifying peptide and/or RNA, 
by obtaining a patient tissue sample (e.g., from biopsy tissue) and assaying it in vitro for RNA or 
peptide levels, structure and/or activity of the expressed peptides (or mRNAs of an 
aforementioned peptide). Methods that are well-known within the art include, but are not limited 
to, immunoassays (e.g., by Western blot analysis, immunoprecipitation followed by sodium 
dodecyl sulfate (SDS) polyacrylamide gel electrophoresis, immunocytochemistry, etc.) and/or 
hybridization assays to detect expression of mRNAs (e.g., Northern assays, dot blots, in situ 
hybridization, and the like). 

Prophylactic Methods 

In one aspect, the invention provides a method for preventing, in a subject, a disease or 
condition associated with an aberrant NOVX expression or activity, by administering to the 
subject an agent that modulates NOVX expression or at least one NOVX activity. Subjects at risk 
for a disease that is caused or contributed to by aberrant NOVX expression or activity can be 
identified by, for example, any or a combination of diagnostic or prognostic assays as described 
herein. Administration of a prophylactic agent can occur prior to the manifestation of symptoms 
characteristic of the NOVX aberrancy, such that a disease or disorder is prevented or, 
alternatively, delayed in its progression. Depending upon the type of NOVX aberrancy, for 
example, an NOVX agonist or NOVX antagonist agent can be used for treating the subject. The 
appropriate agent can be determined based on screening assays described herein. The 
prophylactic methods of the invention are further discussed in the following subsections. 

Therapeutic Methods 

Another aspect of the invention pertains to methods of modulating NOVX expression or 
activity for therapeutic purposes. The modulatory method of the invention involves contacting a 
cell with an agent that modulates one or more of the activities of NOVX protein 
associated with the cell. An agent that modulates NOVX protein activity can be an agent as 
described herein, such as a nucleic acid or a protein, a naturally-occurring cognate ligand of an 
NOVX protein, a peptide, an NOVX peptidomimetic, or other small molecule. In one 
embodiment, the agent stimulates one or more NOVX protein activities. Examples of such 
stimulatory agents include active NOVX protein and a nucleic acid molecule encoding NOVX 
that has been introduced into the cell. In another embodiment, the agent inhibits one or more 
NOVX protein activities. Examples of such inhibitory agents include antisense NOVX nucleic acid 
molecules and anti-NOVX antibodies. These modulatory methods can be performed in vitro (e.g., 
by culturing the cell with the agent) or, alternatively, in vivo (e.g., by administering the agent to a 
subject). As such, the invention provides methods of treating an individual afflicted with a 
disease or disorder characterized by aberrant expression or activity of an NOVX protein or 
nucleic acid molecule. In one embodiment, the method involves administering an agent (e.g., an 
agent identified by a screening assay described herein), or combination of agents that modulates 
(e.g., up-regulates or down-regulates) NOVX expression or activity. In another embodiment, the 
method involves administering an NOVX protein or nucleic acid molecule as therapy to 
compensate for reduced or aberrant NOVX expression or activity. 

Stimulation of NOVX activity is desirable in situations in which NOVX is abnormally 
downregulated and/or in which increased NOVX activity is likely to have a beneficial effect. One 
example of such a situation is where a subject has a disorder characterized by aberrant cell 
proliferation and/or differentiation (e.g., cancer or immune associated disorders). Another 
example of such a situation is where the subject has a gestational disease (e.g., preeclampsia). 

Determination of the Biological Effect of the Therapeutic 

In various embodiments of the invention, suitable in vitro or in vivo assays are performed 
to determine the effect of a specific Therapeutic and whether its administration is indicated for 
treatment of the affected tissue. 

In various specific embodiments, in vitro assays may be performed with representative 
cells of the type(s) involved in the patient's disorder, to determine if a given Therapeutic exerts 
the desired effect upon the cell type(s). Compounds for use in therapy may be tested in suitable 
animal model systems including, but not limited to, rats, mice, chicken, cows, monkeys, rabbits, 
and the like, prior to testing in human subjects. Similarly, for in vivo testing, any of the animal 
model systems known in the art may be used prior to administration to human subjects. 

Prophylactic and Therapeutic Uses of the Compositions of the Invention 

The NOVX nucleic acids and proteins of the invention are useful in potential prophylactic 
and therapeutic applications implicated in a variety of disorders including, but not limited to: 
metabolic disorders, diabetes, obesity, infectious disease, anorexia, cancer-associated cachexia, cancer, 
neurodegenerative disorders, Alzheimer's Disease, Parkinson's Disorder, immune disorders, 
hematopoietic disorders, and the various dyslipidemias, metabolic disturbances associated with 
obesity, the metabolic syndrome X and wasting disorders associated with chronic diseases and 
various cancers. 

As an example, a cDNA encoding the NOVX protein of the invention may be useful in 
gene therapy, and the protein may be useful when administered to a subject in need thereof. By 
way of non-limiting example, the compositions of the invention will have efficacy for treatment 
of patients suffering from: metabolic disorders, diabetes, obesity, infectious disease, anorexia, 
cancer-associated cachexia, cancer, neurodegenerative disorders, Alzheimer's Disease, 
Parkinson's Disorder, immune disorders, hematopoietic disorders, and the various dyslipidemias. 

Both the novel nucleic acid encoding the NOVX protein, and the NOVX protein of the 
invention, or fragments thereof, may also be useful in diagnostic applications, wherein the 
presence or amount of the nucleic acid or the protein are to be assessed. A further use could be as 
an anti-bacterial molecule (i.e., some peptides have been found to possess anti-bacterial 
properties). These materials are further useful in the generation of antibodies, which 
immunospecifically bind to the novel substances of the invention for use in therapeutic or 
diagnostic methods. 

The invention will be further described in the following examples, which do not limit the 
scope of the invention described in the claims. 

Examples 

Example 1: Identification of NOVX Nucleic Acids 

TblastN using CuraGen Corporation's sequence file for polypeptides or homologs was run 
against the Genomic Daily Files made available by GenBank or from files downloaded from the 
individual sequencing centers. Exons were predicted by homology and the intron/exon boundaries 
were determined using standard genetic rules. Exons were further selected and refined by means 
of similarity determination using multiple BLAST (for example, tBlastN, BlastX, and BlastN) 
searches, and, in some instances, GeneScan and Grail. Expressed sequences from both public and 
proprietary databases were also added when available to further define and complete the gene 
sequence. The DNA sequence was then manually corrected for apparent inconsistencies thereby 
obtaining the sequences encoding the full-length protein. 

The novel NOVX target sequences identified in the present invention were subjected to 
the exon linking process to confirm the sequence. PCR primers were designed by starting at the 
most upstream sequence available, for the forward primer, and at the most downstream sequence 
available for the reverse primer. PCR primer sequences were used for obtaining different clones. 
In each case, the sequence was examined, walking inward from the respective termini toward the 
coding sequence, until a suitable sequence that is either unique or highly selective was 
encountered, or, in the case of the reverse primer, until the stop codon was reached. Such primers 
were designed based on in silico predictions for the full length cDNA, part (one or more exons) of 
the DNA or protein sequence of the target sequence, or by translated homology of the predicted 
exons to closely related human sequences or sequences from other species. These primers were then employed 
in PCR amplification based on the following pool of human cDNAs: adrenal gland, bone 
marrow, brain - amygdala, brain - cerebellum, brain - hippocampus, brain - substantia nigra, 
brain - thalamus, brain - whole, fetal brain, fetal kidney, fetal liver, fetal lung, heart, kidney, lymphoma - 
Raji, mammary gland, pancreas, pituitary gland, placenta, prostate, salivary gland, skeletal 
muscle, small intestine, spinal cord, spleen, stomach, testis, thyroid, trachea, uterus. Usually the 
resulting amplicons were gel purified, cloned and sequenced to high redundancy. The PCR 
product derived from exon linking was cloned into the pCR2.1 vector from Invitrogen. The 
resulting bacterial clone has an insert covering the entire open reading frame cloned into the 
pCR2.1 vector. The resulting sequences from all clones were assembled with themselves, with 
other fragments in CuraGen Corporation's database and with public ESTs. Fragments and ESTs 
were included as components for an assembly when the extent of their identity with another 
component of the assembly was at least 95% over 50 bp. In addition, sequence traces were 
evaluated manually and edited for corrections if appropriate. These procedures provide the 
sequence reported herein. 
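
The assembly inclusion criterion stated above (identity of at least 95% over 50 bp) can be expressed as a small check. This is a sketch under stated assumptions: the overlapping regions are supplied already aligned and gap-free, and the function name `meets_inclusion_criterion` and the test sequences are hypothetical rather than a description of the software actually used.

```python
def meets_inclusion_criterion(overlap_a: str, overlap_b: str,
                              min_identity: float = 0.95,
                              min_length: int = 50) -> bool:
    """Check whether two aligned overlaps satisfy an identity threshold.

    Both strings are assumed to be the aligned, gap-free overlapping
    regions of two components; a real assembler would compute the
    alignment itself.
    """
    if len(overlap_a) != len(overlap_b):
        raise ValueError("overlaps must be aligned to equal length")
    if len(overlap_a) < min_length:
        return False
    matches = sum(x == y for x, y in zip(overlap_a.upper(), overlap_b.upper()))
    return matches / len(overlap_a) >= min_identity


# Hypothetical 60 bp overlap with two mismatches (about 96.7% identity).
reference = "ACGT" * 15
candidate = "TT" + reference[2:]
print(meets_inclusion_criterion(reference, candidate))
```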

Physical clone: Exons were predicted by homology and the intron/exon boundaries were 
determined using standard genetic rules. Exons were further selected and refined by means of 
similarity determination using multiple BLAST (for example, tBlastN, BlastX, and BlastN) 
searches, and, in some instances, GeneScan and Grail. Expressed sequences from both public and 
proprietary databases were also added when available to further define and complete the gene 
sequence. The DNA sequence was then manually corrected for apparent inconsistencies thereby 
obtaining the sequences encoding the full-length protein. 

Example 2: Identification of Single Nucleotide Polymorphisms in NOVX nucleic acid 
sequences 

Variant sequences are also included in this application. A variant sequence can include a 
single nucleotide polymorphism (SNP). A SNP can, in some instances, be referred to as a "cSNP" 
to denote that the nucleotide sequence containing the SNP originates as a cDNA. A SNP can arise 
in several ways. For example, a SNP may be due to a substitution of one nucleotide for another at 
the polymorphic site. Such a substitution can be either a transition or a transversion. A SNP can 
also arise from a deletion of a nucleotide or an insertion of a nucleotide, relative to a reference 
allele. In this case, the polymorphic site is a site at which one allele bears a gap with respect to a 
particular nucleotide in another allele. SNPs occurring within genes may result in an alteration of 
the amino acid encoded by the gene at the position of the SNP. Intragenic SNPs may also be 
silent, when a codon including a SNP encodes the same amino acid as a result of the redundancy 
of the genetic code. SNPs occurring outside the region of a gene, or in an intron within a gene, do 
not result in changes in any amino acid sequence of a protein but may result in altered regulation 
of the expression pattern. Examples include alteration in temporal expression, physiological 
response regulation, cell type expression regulation, intensity of expression, and stability of 
transcribed message. 
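
The substitution and gap cases described above can be illustrated with a minimal sketch. The function name classify_snp and the use of "-" to mark a gap are assumptions made for this illustration only and are not part of the disclosed procedure.

```python
# Minimal sketch: classify a single-nucleotide variant as a transition,
# a transversion, an insertion or a deletion relative to a reference allele.
# The name classify_snp and the "-" gap symbol are illustrative assumptions.

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def classify_snp(ref: str, alt: str) -> str:
    """Return the class of a variant given reference and variant alleles."""
    ref, alt = ref.upper(), alt.upper()
    if ref == "-" and alt == "-":
        raise ValueError("both alleles are gaps")
    if ref == "-":
        return "insertion"   # the reference allele bears the gap
    if alt == "-":
        return "deletion"    # the variant allele bears the gap
    if ref not in BASES or alt not in BASES:
        raise ValueError("alleles must be A, C, G, T or '-'")
    if ref == alt:
        raise ValueError("alleles are identical; no polymorphism")
    # Purine<->purine or pyrimidine<->pyrimidine is a transition;
    # any purine<->pyrimidine exchange is a transversion.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"
```

For example, an A-to-G change is reported as a transition, whereas an A-to-C change is reported as a transversion.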

SeqCalling assemblies produced by the exon linking process were selected and extended 
using the following criteria. Genomic clones having regions with 98% identity to all or part of the 
initial or extended sequence were identified by BLASTN searches using the relevant sequence to 
query human genomic databases. The genomic clones that resulted were selected for further 
analysis because this identity indicates that these clones contain the genomic locus for these 
SeqCalling assemblies. These sequences were analyzed for putative coding regions as well as for 
similarity to the known DNA and protein sequences. Programs used for these analyses include 
Grail, Genscan, BLAST, HMMER, FASTA, Hybrid and other relevant programs. 

Some additional genomic regions may have also been identified because selected 
SeqCalling assemblies map to those regions. Such SeqCalling sequences may have overlapped 
with regions defined by homology or exon prediction. They may also be included because the 
location of the fragment was in the vicinity of genomic regions identified by similarity or exon 
prediction that had been included in the original predicted sequence. The sequence so identified 
was manually assembled and then may have been extended using one or more additional 
sequences taken from CuraGen Corporation's human SeqCalling database. SeqCalling fragments 
suitable for inclusion were identified by the CuraTools™ program SeqExtend or by identifying 
SeqCalling fragments mapping to the appropriate regions of the genomic clones analyzed. 

The regions defined by the procedures described above were then manually integrated and 
corrected for apparent inconsistencies that may have arisen, for example, from miscalled bases in 
the original fragments or from discrepancies between predicted exon junctions, EST locations and 
regions of sequence similarity, to derive the final sequence disclosed herein. When necessary, the 
process to identify and analyze SeqCalling assemblies and genomic clones was reiterated to 
derive the full length sequence (Alderborn et al., Determination of Single Nucleotide 
Polymorphisms by Real-time Pyrophosphate DNA Sequencing. Genome Research. 10 (8) 
1249-1265, 2000). 

Example 3. Quantitative expression analysis of clones in various cells and tissues 

The quantitative expression of various clones was assessed using microtiter plates 
containing RNA samples from a variety of normal and pathology-derived cells, cell lines and 
tissues using real time quantitative PCR (RTQ PCR). RTQ PCR was performed on an Applied 
Biosystems ABI PRISM® 7700 or an ABI PRISM® 7900 HT Sequence Detection System. 
Various collections of samples are assembled on the plates, and referred to as Panel 1 (containing 
normal tissues and cancer cell lines), Panel 2 (containing samples derived from tissues from 
normal and cancer sources), Panel 3 (containing cancer cell lines), Panel 4 (containing cells and 
cell lines from normal tissues and cells related to inflammatory conditions), Panel 5D/5I 
(containing human tissues and cell lines with an emphasis on metabolic diseases), 
AI_comprehensive_panel (containing normal tissue and samples from autoimmune diseases), 
Panel CNSD.01 (containing central nervous system samples from normal and diseased brains) and 
CNS_neurodegeneration_panel (containing samples from normal and Alzheimer's diseased 
brains). 

RNA integrity from all samples is controlled for quality by visual assessment of agarose 
gel electropherograms using 28S and 18S ribosomal RNA staining intensity ratio as a guide (2:1 
to 2.5:1 28S:18S) and the absence of low molecular weight RNAs that would be indicative of 
degradation products. Samples are controlled against genomic DNA contamination by RTQ PCR 
reactions run in the absence of reverse transcriptase using probe and primer sets designed to 
amplify across the span of a single exon. 

First, the RNA samples were normalized to reference nucleic acids such as constitutively 
expressed genes (for example, β-actin and GAPDH). Normalized RNA (5 µl) was converted to 
cDNA and analyzed by RTQ-PCR using One Step RT-PCR Master Mix Reagents (Applied 
Biosystems; Catalog No. 4309169) and gene-specific primers according to the manufacturer's 
instructions. 

In other cases, non-normalized RNA samples were converted to single strand cDNA 
(sscDNA) using Superscript II (Invitrogen Corporation; Catalog No. 18064-147) and random 
hexamers according to the manufacturer's instructions. Reactions containing up to 10 µg of total 
RNA were performed in a volume of 20 µl and incubated for 60 minutes at 42°C. This reaction 
can be scaled up to 50 µg of total RNA in a final volume of 100 µl. sscDNA samples are then 
normalized to reference nucleic acids as described previously, using 1X TaqMan® Universal 
Master mix (Applied Biosystems; catalog No. 4324020), following the manufacturer's 
instructions. 

Probes and primers were designed for each assay according to Applied Biosystems Primer 
Express Software package (version I for Apple Computer's Macintosh Power PC) or a similar 
algorithm using the target sequence as input. Default settings were used for reaction conditions 
and the following parameters were set before selecting primers: primer concentration = 250 nM, 
primer melting temperature (Tm) range = 58°-60°C, primer optimal Tm = 59°C, maximum primer 
Tm difference = 2°C, probe does not have 5' G, probe Tm must be 10°C greater than primer Tm, 
amplicon size 75bp to 100bp. The probes and primers selected (see below) were synthesized by 
Synthegen (Houston, TX, USA). Probes were double purified by HPLC to remove uncoupled dye 
and evaluated by mass spectroscopy to verify coupling of reporter and quencher dyes to the 5' and 
3' ends of the probe, respectively. Their final concentrations were: forward and reverse primers, 
900nM each, and probe, 200nM. 
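
The design parameters listed above can be expressed as a simple screening check. The following is a minimal sketch only: the names AssayDesign and design_violations are hypothetical, the Tm values are assumed to be supplied by the design software rather than computed here, and the requirement that the probe Tm be 10°C greater than the primer Tm is interpreted, as an assumption, to mean at least 10°C above the higher of the two primer Tm values.

```python
# Minimal sketch of a screen against the stated design parameters.
# AssayDesign and design_violations are hypothetical names; Tm values are
# inputs (for example, as reported by the design software), not computed.
from dataclasses import dataclass
from typing import List


@dataclass
class AssayDesign:
    forward_tm: float   # forward primer Tm, degrees C
    reverse_tm: float   # reverse primer Tm, degrees C
    probe_seq: str      # probe sequence written 5' to 3'
    probe_tm: float     # probe Tm, degrees C
    amplicon_len: int   # amplicon size, base pairs


def design_violations(d: AssayDesign) -> List[str]:
    """Return a list of the stated parameters that the design fails."""
    problems = []
    for label, tm in (("forward", d.forward_tm), ("reverse", d.reverse_tm)):
        if not 58.0 <= tm <= 60.0:
            problems.append(f"{label} primer Tm outside 58-60 C")
    if abs(d.forward_tm - d.reverse_tm) > 2.0:
        problems.append("primer Tm difference exceeds 2 C")
    if d.probe_seq.upper().startswith("G"):
        problems.append("probe has a 5' G")
    # Assumption: compare the probe against the higher primer Tm.
    if d.probe_tm - max(d.forward_tm, d.reverse_tm) < 10.0:
        problems.append("probe Tm less than 10 C above primer Tm")
    if not 75 <= d.amplicon_len <= 100:
        problems.append("amplicon size outside 75-100 bp")
    return problems
```

An empty list indicates that a candidate primer and probe set satisfies every parameter checked; each entry otherwise names one failed parameter.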

PCR conditions: When working with RNA samples, normalized RNA from each tissue 
and each cell line was spotted in each well of either a 96 well or a 384-well PCR plate (Applied 
Biosystems). PCR cocktails included either a single gene specific probe and primers set, or two 
multiplexed probe and primers sets (a set specific for the target clone and another gene-specific 
set multiplexed with the target probe). PCR reactions were set up using TaqMan® One-Step RT- 
PCR Master Mix (Applied Biosystems, Catalog No. 4313803) following manufacturer's 
instructions. Reverse transcription was performed at 48°C for 30 minutes followed by 
amplification/PCR cycles as follows: 95°C 10 min, then 40 cycles of 95°C for 15 seconds, 60°C 
for 1 minute. Results were recorded as CT values (cycle at which a given sample crosses a 
threshold level of fluorescence) using a log scale, with the difference in RNA concentration 
between a given sample and the sample with the lowest CT value being represented as 2 to the 
power of delta CT. The percent relative expression is then obtained by taking the reciprocal of this 
RNA difference and multiplying by 100. 
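
The calculation described above can be sketched as follows. This is a minimal illustration, assuming that delta CT is the CT value of a given sample minus the lowest CT value on the plate; the function name and the sample labels are hypothetical.

```python
# Minimal sketch of the percent relative expression calculation.
# Assumption: delta CT = CT(sample) - lowest CT observed among the samples.
from typing import Dict


def percent_relative_expression(ct_values: Dict[str, float]) -> Dict[str, float]:
    """Map each sample name to its expression relative to the lowest-CT sample."""
    lowest_ct = min(ct_values.values())
    result = {}
    for sample, ct in ct_values.items():
        delta_ct = ct - lowest_ct
        rna_difference = 2.0 ** delta_ct        # fold difference in RNA
        result[sample] = 100.0 / rna_difference  # reciprocal, times 100
    return result
```

Under this assumption, the sample with the lowest CT value is assigned 100 percent, and a sample crossing the threshold one cycle later is assigned 50 percent.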

When working with sscDNA samples, normalized sscDNA was used as described 
previously for RNA samples. PCR reactions containing one or two sets of probe and primers were 
set up as described previously, using 1X TaqMan® Universal Master mix (Applied Biosystems; 
catalog No. 4324020), following the manufacturer's instructions. PCR amplification was 
performed as follows: 95°C 10 min, then 40 cycles of 95°C for 15 seconds, 60°C for 1 minute. 
Results were analyzed and processed as described previously. 

Panels 1, 1.1, 1.2, and 1.3D 

The plates for Panels 1, 1.1, 1.2 and 1.3D include 2 control wells (genomic DNA control 
and chemistry control) and 94 wells containing cDNA from various samples. The samples in these 
panels are broken into 2 classes: samples derived from cultured cell lines and samples derived 
from primary normal tissues. The cell lines are derived from cancers of the following types: lung 
cancer, breast cancer, melanoma, colon cancer, prostate cancer, CNS cancer, squamous cell 
carcinoma, ovarian cancer, liver cancer, renal cancer, gastric cancer and pancreatic cancer. Cell 
lines used in these panels are widely available through the American Type Culture Collection 
(ATCC), a repository for cultured cell lines, and were cultured using the conditions recommended 
by the ATCC. The normal tissues found on these panels are comprised of samples derived from 
all major organ systems from single adult individuals or fetuses. These samples are derived from 
the following organs: adult skeletal muscle, fetal skeletal muscle, adult heart, fetal heart, adult 
kidney, fetal kidney, adult liver, fetal liver, adult lung, fetal lung, various regions of the brain, the 
spleen, bone marrow, lymph node, pancreas, salivary gland, pituitary gland, adrenal gland, spinal 
cord, thymus, stomach, small intestine, colon, bladder, trachea, breast, ovary, uterus, placenta, 
prostate, testis and adipose. 

In the results for Panels 1, 1.1, 1.2 and 1.3D, the following abbreviations are used: 

ca. = carcinoma, 
* = established from metastasis, 
met = metastasis, 
s cell var = small cell variant, 
non-s = non-sm = non-small, 
squam = squamous, 
pl. eff = pl effusion = pleural effusion, 
glio = glioma, 
astro = astrocytoma, and 
neuro = neuroblastoma. 

General_screening_panel_v1.4 

The plates for Panel 1.4 include 2 control wells (genomic DNA control and chemistry 
control) and 94 wells containing cDNA from various samples. The samples in Panel 1.4 are 
broken into 2 classes: samples derived from cultured cell lines and samples derived from primary 
normal tissues. The cell lines are derived from cancers of the following types: lung cancer, breast 
cancer, melanoma, colon cancer, prostate cancer, CNS cancer, squamous cell carcinoma, ovarian 
cancer, liver cancer, renal cancer, gastric cancer and pancreatic cancer. Cell lines used in Panel 
1.4 are widely available through the American Type Culture Collection (ATCC), a repository for 
cultured cell lines, and were cultured using the conditions recommended by the ATCC. The 
normal tissues found on Panel 1.4 are comprised of pools of samples derived from all major organ 
systems from 2 to 5 different adult individuals or fetuses. These samples are derived from the 
following organs: adult skeletal muscle, fetal skeletal muscle, adult heart, fetal heart, adult kidney, 
fetal kidney, adult liver, fetal liver, adult lung, fetal lung, various regions of the brain, the spleen, 
bone marrow, lymph node, pancreas, salivary gland, pituitary gland, adrenal gland, spinal cord, 
thymus, stomach, small intestine, colon, bladder, trachea, breast, ovary, uterus, placenta, prostate, 
testis and adipose. Abbreviations are as described for Panels 1, 1.1, 1.2, and 1.3D. 

Panels 2D and 2.2 

The plates for Panels 2D and 2.2 generally include 2 control wells and 94 test samples 
composed of RNA or cDNA isolated from human tissue procured by surgeons working in close 
cooperation with the National Cancer Institute's Cooperative Human Tissue Network (CHTN) or 
the National Disease Research Interchange (NDRI). The tissues are derived from human 
malignancies and in cases where indicated many malignant tissues have "matched margins" 
obtained from noncancerous tissue just adjacent to the tumor. These are termed normal adjacent 
tissues and are denoted "NAT" in the results below. The tumor tissue and the "matched margins" 
are evaluated by two independent pathologists (the surgical pathologists and again by a 
pathologist at NDRI or CHTN). This analysis provides a gross histopathological assessment of 
tumor differentiation grade. Moreover, most samples include the original surgical pathology 
report that provides information regarding the clinical stage of the patient. These matched margins 
are taken from the tissue surrounding (i.e., immediately proximal to) the zone of surgery 
(designated "NAT", for normal adjacent tissue, in Table RR). In addition, RNA and cDNA 
samples were obtained from various human tissues derived from autopsies performed on elderly 
people or sudden death victims (accidents, etc.). These tissues were ascertained to be free of 
disease and were purchased from various commercial sources such as Clontech (Palo Alto, CA), 
Research Genetics, and Invitrogen. 

Panel 3D 

The plates of Panel 3D are comprised of 94 cDNA samples and two control samples. 
Specifically, 92 of these samples are derived from cultured human cancer cell lines and 2 from 
human primary cerebellar tissue, in addition to the 2 controls. The human cell lines are generally obtained from 
ATCC (American Type Culture Collection), NCI or the German tumor cell bank and fall into the 
following tissue groups: Squamous cell carcinoma of the tongue, breast cancer, prostate cancer, 
melanoma, epidermoid carcinoma, sarcomas, bladder carcinomas, pancreatic cancers, kidney 
cancers, leukemias/lymphomas, ovarian/uterine/cervical, gastric, colon, lung and CNS cancer cell 
lines. In addition, there are two independent samples of cerebellum. These cells are all cultured 
under standard recommended conditions and RNA extracted using the standard procedures. The 
cell lines in Panels 3D and 1.3D are among the most common cell lines used in the scientific literature. 

Panels 4D, 4R, and 4.1D 

Panel 4 includes samples on a 96 well plate (2 control wells, 94 test samples) composed of 
RNA (Panel 4R) or cDNA (Panels 4D/4.1D) isolated from various human cell lines or tissues 
related to inflammatory conditions. Total RNA from control normal tissues such as colon and 
lung (Stratagene, La Jolla, CA) and thymus and kidney (Clontech) was employed. Total RNA 
from liver tissue from cirrhosis patients and kidney from lupus patients was obtained from 
BioChain (Biochain Institute, Inc., Hayward, CA). Intestinal tissue for RNA preparation from 
patients diagnosed as having Crohn's disease and ulcerative colitis was obtained from the National 
Disease Research Interchange (NDRI) (Philadelphia, PA). 

Astrocytes, lung fibroblasts, dermal fibroblasts, coronary artery smooth muscle cells, 
small airway epithelium, bronchial epithelium, microvascular dermal endothelial cells, 
microvascular lung endothelial cells, human pulmonary aortic endothelial cells, human umbilical 
vein endothelial cells were all purchased from Clonetics (Walkersville, MD) and grown in the 
media supplied for these cell types by Clonetics. These primary cell types were activated with 
various cytokines or combinations of cytokines for 6 and/or 12-14 hours, as indicated. The 
following cytokines were used: IL-1 beta at approximately 1-5ng/ml, TNF alpha at approximately 
5-10ng/ml, IFN gamma at approximately 20-50ng/ml, IL-4 at approximately 5-10ng/ml, IL-9 at 
approximately 5-10ng/ml, IL-13 at approximately 5-10ng/ml. Endothelial cells were sometimes 
starved for various times by culture in the basal media from Clonetics with 0.1% serum. 

Mononuclear cells were prepared from blood of employees at CuraGen Corporation, using 
Ficoll. LAK cells were prepared from these cells by culture in DMEM 5% FCS (Hyclone), 
100µM non essential amino acids (Gibco/Life Technologies, Rockville, MD), 1mM sodium 
pyruvate (Gibco), mercaptoethanol 5.5×10⁻⁵M (Gibco), and 10mM Hepes (Gibco) and Interleukin 
2 for 4-6 days. Cells were then either activated with 10-20ng/ml PMA and 1-2µg/ml ionomycin, 
IL-12 at 5-10ng/ml, IFN gamma at 20-50ng/ml and IL-18 at 5-10ng/ml for 6 hours. In some cases, 
mononuclear cells were cultured for 4-5 days in DMEM 5% FCS (Hyclone), 100µM non essential 
amino acids (Gibco), 1mM sodium pyruvate (Gibco), mercaptoethanol 5.5×10⁻⁵M (Gibco), and 
10mM Hepes (Gibco) with PHA (phytohemagglutinin) or PWM (pokeweed mitogen) at 
approximately 5µg/ml. Samples were taken at 24, 48 and 72 hours for RNA preparation. MLR 
(mixed lymphocyte reaction) samples were obtained by taking blood from two donors, isolating 
the mononuclear cells using Ficoll and mixing the isolated mononuclear cells 1:1 at a final 
concentration of approximately 2×10⁶ cells/ml in DMEM 5% FCS (Hyclone), 100µM non 
essential amino acids (Gibco), 1mM sodium pyruvate (Gibco), mercaptoethanol (5.5×10⁻⁵M) 
(Gibco), and 10mM Hepes (Gibco). The MLR was cultured and samples taken at various time 
points ranging from 1-7 days for RNA preparation. 

Monocytes were isolated from mononuclear cells using CD14 Miltenyi Beads, positive VS 
selection columns and a Vario Magnet according to the manufacturer's instructions. Monocytes 
were differentiated into dendritic cells by culture in DMEM 5% fetal calf serum (FCS) (Hyclone, 
Logan, UT), 100µM non essential amino acids (Gibco), 1mM sodium pyruvate (Gibco), 
mercaptoethanol 5.5×10⁻⁵M (Gibco), and 10mM Hepes (Gibco), 50ng/ml GMCSF and 5ng/ml 
IL-4 for 5-7 days. Macrophages were prepared by culture of monocytes for 5-7 days in DMEM 5% 
FCS (Hyclone), 100µM non essential amino acids (Gibco), 1mM sodium pyruvate (Gibco), 
mercaptoethanol 5.5×10⁻⁵M (Gibco), 10mM Hepes (Gibco) and 10% AB Human Serum or MCSF 
at approximately 50ng/ml. Monocytes, macrophages and dendritic cells were stimulated for 6 and 
12-14 hours with lipopolysaccharide (LPS) at 100ng/ml. Dendritic cells were also stimulated with 
anti-CD40 monoclonal antibody (Pharmingen) at 10µg/ml for 6 and 12-14 hours. 

CD4 lymphocytes, CD8 lymphocytes and NK cells were also isolated from mononuclear 
cells using CD4, CD8 and CD56 Miltenyi beads, positive VS selection columns and a Vario 
Magnet according to the manufacturer's instructions. CD45RA and CD45RO CD4 lymphocytes 
were isolated by depleting mononuclear cells of CD8, CD56, CD14 and CD19 cells using CD8, 
CD56, CD14 and CD19 Miltenyi beads and positive selection. CD45RO beads were then used to 
isolate the CD45RO CD4 lymphocytes with the remaining cells being CD45RA CD4 
lymphocytes. CD45RA CD4, CD45RO CD4 and CD8 lymphocytes were placed in DMEM 5% 
FCS (Hyclone), 100µM non essential amino acids (Gibco), 1mM sodium pyruvate (Gibco), 
mercaptoethanol 5.5×10⁻⁵M (Gibco), and 10mM Hepes (Gibco) and plated at 10⁶ cells/ml onto 
Falcon 6 well tissue culture plates that had been coated overnight with 0.5µg/ml anti-CD28 
(Pharmingen) and 3µg/ml anti-CD3 (OKT3, ATCC) in PBS. After 6 and 24 hours, the cells were 
harvested for RNA preparation. To prepare chronically activated CD8 lymphocytes, we activated 
the isolated CD8 lymphocytes for 4 days on anti-CD28 and anti-CD3 coated plates and then 
harvested the cells and expanded them in DMEM 5% FCS (Hyclone), 100µM non essential amino 
acids (Gibco), 1mM sodium pyruvate (Gibco), mercaptoethanol 5.5×10⁻⁵M (Gibco), and 10mM 
Hepes (Gibco) and IL-2. The expanded CD8 cells were then activated again with plate bound 
anti-CD3 and anti-CD28 for 4 days and expanded as before. RNA was isolated 6 and 24 hours 
after the second activation and after 4 days of the second expansion culture. The isolated NK cells 
were cultured in DMEM 5% FCS (Hyclone), 100µM non essential amino acids (Gibco), 1mM 
sodium pyruvate (Gibco), mercaptoethanol 5.5×10⁻⁵M (Gibco), and 10mM Hepes (Gibco) and 
IL-2 for 4-6 days before RNA was prepared. 

To obtain B cells, tonsils were procured from NDRI. The tonsil was cut up with sterile 
dissecting scissors and then passed through a sieve. Tonsil cells were then spun down and 
resuspended at 10⁶ cells/ml in DMEM 5% FCS (Hyclone), 100µM non essential amino acids 
(Gibco), 1mM sodium pyruvate (Gibco), mercaptoethanol 5.5×10⁻⁵M (Gibco), and 10mM Hepes 
(Gibco). To activate the cells, we used PWM at 5µg/ml or anti-CD40 (Pharmingen) at 
approximately 10µg/ml and IL-4 at 5-10ng/ml. Cells were harvested for RNA preparation at 
24, 48 and 72 hours. 

To prepare the primary and secondary Th1/Th2 and Tr1 cells, six-well Falcon plates were 
coated overnight with 10µg/ml anti-CD28 (Pharmingen) and 2µg/ml OKT3 (ATCC), and then 
washed twice with PBS. Umbilical cord blood CD4 lymphocytes (Poietic Systems, 
Germantown, MD) were cultured at 10⁵-10⁶ cells/ml in DMEM 5% FCS (Hyclone), 100µM non essential 
amino acids (Gibco), 1mM sodium pyruvate (Gibco), mercaptoethanol 5.5×10⁻⁵M (Gibco), 10mM 
Hepes (Gibco) and IL-2 (4ng/ml). IL-12 (5ng/ml) and anti-IL4 (1µg/ml) were used to direct to 
Th1, while IL-4 (5ng/ml) and anti-IFN gamma (1µg/ml) were used to direct to Th2 and IL-10 at 
5ng/ml was used to direct to Tr1. After 4-5 days, the activated Th1, Th2 and Tr1 lymphocytes 
were washed once in DMEM and expanded for 4-7 days in DMEM 5% FCS (Hyclone), 100µM 
non essential amino acids (Gibco), 1mM sodium pyruvate (Gibco), mercaptoethanol 5.5×10⁻⁵M 
(Gibco), 10mM Hepes (Gibco) and IL-2 (1ng/ml). Following this, the activated Th1, Th2 and Tr1 
lymphocytes were re-stimulated for 5 days with anti-CD28/OKT3 and cytokines as described 
above, but with the addition of anti-CD95L (1µg/ml) to prevent apoptosis. After 4-5 days, the 
Th1, Th2 and Tr1 lymphocytes were washed and then expanded again with IL-2 for 4-7 days. 
Activated Th1 and Th2 lymphocytes were maintained in this way for a maximum of three cycles. 
RNA was prepared from primary and secondary Th1, Th2 and Tr1 after 6 and 24 hours following 
the second and third activations with plate bound anti-CD3 and anti-CD28 mAbs and 4 days into 
the second and third expansion cultures in Interleukin 2. 

The following leukocyte cell lines were obtained from the ATCC: Ramos, EOL-1, 
KU-812. EOL cells were further differentiated by culture in 0.1mM dbcAMP at 5×10⁵ cells/ml for 8 
days, changing the media every 3 days and adjusting the cell concentration to 5×10⁵ cells/ml. For 
the culture of these cells, we used DMEM or RPMI (as recommended by the ATCC), with the 
addition of 5% FCS (Hyclone), 100µM non essential amino acids (Gibco), 1mM sodium pyruvate 
(Gibco), mercaptoethanol 5.5×10⁻⁵M (Gibco), 10mM Hepes (Gibco). RNA was either prepared 
from resting cells or cells activated with PMA at 10ng/ml and ionomycin at 1µg/ml for 6 and 14 
hours. Keratinocyte line CCD 106 and an airway epithelial tumor line NCI-H292 were also 
obtained from the ATCC. Both were cultured in DMEM 5% FCS (Hyclone), 100µM non essential 
amino acids (Gibco), 1mM sodium pyruvate (Gibco), mercaptoethanol 5.5×10⁻⁵M (Gibco), and 
10mM Hepes (Gibco). CCD1106 cells were activated for 6 and 14 hours with approximately 5 
ng/ml TNF alpha and 1ng/ml IL-1 beta, while NCI-H292 cells were activated for 6 and 14 hours 
with the following cytokines: 5ng/ml IL-4, 5ng/ml IL-9, 5ng/ml IL-13 and 25ng/ml IFN gamma. 

For these cell lines and blood cells, RNA was prepared by lysing approximately 
10⁷ cells/ml using Trizol (Gibco BRL). Briefly, 1/10 volume of bromochloropropane (Molecular 
Research Corporation) was added to the RNA sample, vortexed and after 10 minutes at room 
temperature, the tubes were spun at 14,000 rpm in a Sorvall SS34 rotor. The aqueous phase was 
removed and placed in a 15ml Falcon Tube. An equal volume of isopropanol was added and left 
at -20°C overnight. The precipitated RNA was spun down at 9,000 rpm for 15 min in a Sorvall 
SS34 rotor and washed in 70% ethanol. The pellet was redissolved in 300µl of RNAse-free water 
and 35µl buffer (Promega), 5µl DTT, 7µl RNAsin and 8µl DNAse were added. The tube was 
incubated at 37°C for 30 minutes to remove contaminating genomic DNA, extracted once with 
phenol chloroform and re-precipitated with 1/10 volume of 3M sodium acetate and 2 volumes of 
100% ethanol. The RNA was spun down and placed in RNAse free water. RNA was stored at 
-80°C. 

AI_comprehensive panel_v1.0 

The plates for AI_comprehensive panel_v1.0 include two control wells and 89 test 
samples comprised of cDNA isolated from surgical and postmortem human tissues obtained from 
the Backus Hospital and Clinomics (Frederick, MD). Total RNA was extracted from tissue 
samples from the Backus Hospital in the Facility at CuraGen. Total RNA from other tissues was 
obtained from Clinomics. 

Joint tissues including synovial fluid, synovium, bone and cartilage were obtained from 
patients undergoing total knee or hip replacement surgery at the Backus Hospital. Tissue samples 
were immediately snap frozen in liquid nitrogen to ensure that isolated RNA was of optimal 
quality and not degraded. Additional samples of osteoarthritis and rheumatoid arthritis joint 
tissues were obtained from Clinomics. Normal control tissues were supplied by Clinomics and 
were obtained during autopsy of trauma victims. 

Surgical specimens of psoriatic tissues and adjacent matched tissues were provided as total 
RNA by Clinomics. Two male and two female patients were selected between the ages of 25 and 
47. None of the patients were taking prescription drugs at the time samples were isolated. 

Surgical specimens of diseased colon from patients with ulcerative colitis and Crohn's 
disease and adjacent matched tissues were obtained from Clinomics. Bowel tissue from three 
female and three male Crohn's patients between the ages of 41-69 was used. Two patients were 
not on prescription medication while the others were taking dexamethasone, phenobarbital, or 
tylenol. Ulcerative colitis tissue was from three male and four female patients. Four of the patients 
were taking lebvid and two were on phenobarbital. 

Total RNA from post mortem lung tissue from trauma victims with no disease or with 
emphysema, asthma or COPD was purchased from Clinomics. Emphysema patients ranged in age 
from 40-70 and all were smokers; this age range was chosen to focus on patients with 
cigarette-linked emphysema and to avoid those patients with alpha-1 anti-trypsin deficiencies. Asthma 
patients ranged in age from 36-75, and smokers were excluded to avoid patients who could also 
have COPD. COPD patients ranged in age from 35-80 and included both smokers and 
non-smokers. Most patients were taking corticosteroids and bronchodilators. 

In the labels employed to identify tissues in the AI_comprehensive panel_v1.0 panel, the 
following abbreviations are used: 

AI = Autoimmunity 
Syn = Synovial 
Normal = No apparent disease 
Rep22/Rep20 = individual patients 
RA = Rheumatoid arthritis 
Backus = From Backus Hospital 
OA = Osteoarthritis 
(SS) (BA) (MF) = Individual patients 
Adj = Adjacent tissue 
Match control = adjacent tissues 
-M = Male 
-F = Female 
COPD = Chronic obstructive pulmonary disease 

Panels 5D and 5I 

The plates for Panel 5D and 5I include two control wells and a variety of cDNAs isolated 
from human tissues and cell lines with an emphasis on metabolic diseases. Metabolic tissues were 
obtained from patients enrolled in the Gestational Diabetes study. Cells were obtained during 
different stages in the differentiation of adipocytes from human mesenchymal stem cells. Human 
pancreatic islets were also obtained. 

In the Gestational Diabetes study subjects are young (18-40 years), otherwise healthy 
women with and without gestational diabetes undergoing routine (elective) Caesarean section. 
After delivery of the infant, when the surgical incisions were being repaired/closed, the 
obstetrician removed a small sample (<1 cc) of the exposed metabolic tissues during the 
closure of each surgical level. The biopsy material was rinsed in sterile saline, blotted and fast 
frozen within 5 minutes from the time of removal. The tissue was then flash frozen in liquid 
nitrogen and stored, individually, in sterile screw-top tubes and kept on dry ice for shipment to or 
to be picked up by CuraGen. The metabolic tissues of interest include uterine wall (smooth 
muscle), visceral adipose, skeletal muscle (rectus) and subcutaneous adipose. Patient descriptions 
are as follows: 

Patient 2      Diabetic Hispanic, overweight, not on insulin 
Patient 7-9    Nondiabetic Caucasian and obese (BMI>30) 
Patient 10     Diabetic Hispanic, overweight, on insulin 
Patient 11     Nondiabetic African American and overweight 
Patient 12     Diabetic Hispanic on insulin 

Adipocyte differentiation was induced in donor progenitor cells obtained from Osirus (a 
division of Clonetics/BioWhittaker) in triplicate, except for Donor 3U which had only two 
replicates. Scientists at Clonetics isolated, grew and differentiated human mesenchymal stem cells 
(HuMSCs) for CuraGen based on the published protocol found in Mark F. Pittenger, et al., 
Multilineage Potential of Adult Human Mesenchymal Stem Cells, Science, Apr 2 1999: 143-147. 
Clonetics provided Trizol lysates or frozen pellets suitable for mRNA isolation and ds cDNA 
production. A general description of each donor is as follows: 

Donor 2 and 3 U: Mesenchymal Stem cells, Undifferentiated Adipose 
Donor 2 and 3 AM: Adipose, Adipose Midway Differentiated 
Donor 2 and 3 AD: Adipose, Adipose Differentiated 

Human cell lines were generally obtained from ATCC (American Type Culture 
Collection), NCI or the German tumor cell bank and fall into the following tissue groups: kidney 
proximal convoluted tubule, uterine smooth muscle cells, small intestine, liver HepG2 cancer 
cells, heart primary stromal cells, and adrenal cortical adenoma cells. These cells are all cultured 
under standard recommended conditions and RNA extracted using the standard procedures. All 
samples were processed at CuraGen to produce single stranded cDNA. 

Panel 5I contains all samples previously described with the addition of pancreatic islets 
from a 58 year old female patient obtained from the Diabetes Research Institute at the University 
of Miami School of Medicine. Islet tissue was processed to total RNA at an outside source and 
delivered to CuraGen for addition to Panel 5I. 

In the labels employed to identify tissues in the 5D and 5I panels, the following 
abbreviations are used: 

GO Adipose = Greater Omentum Adipose 
SK = Skeletal Muscle 
UT = Uterus 
PL = Placenta 

AD = Adipose Differentiated 

AM = Adipose Midway Differentiated 

U = Undifferentiated Stem Cells 

Panel CNSD.01 

The plates for Panel CNSD.01 include two control wells and 94 test samples comprised of 
cDNA isolated from postmortem human brain tissue obtained from the Harvard Brain Tissue 
Resource Center. Brains are removed from calvaria of donors between 4 and 24 hours after death, 
sectioned by neuroanatomists, and frozen at -80°C in liquid nitrogen vapor. All brains are 
sectioned and examined by neuropathologists to confirm diagnoses with clear associated 
neuropathology. 
Disease diagnoses are taken from patient records. The panel contains two brains from each 
of the following diagnoses: Alzheimer's disease, Parkinson's disease, Huntington's disease, 
Progressive Supranuclear Palsy, Depression, and "Normal controls". Within each of these brains, 
the following regions are represented: cingulate gyrus, temporal pole, globus pallidus, substantia 
nigra, Brodmann Area 4 (primary motor strip), Brodmann Area 7 (parietal cortex), Brodmann Area 9 
(prefrontal cortex), and Brodmann Area 17 (occipital cortex). Not all brain regions are represented 
in all cases; e.g., Huntington's disease is characterized in part by neurodegeneration in the globus 
pallidus, thus this region is impossible to obtain from confirmed Huntington's cases. Likewise 
Parkinson's disease is characterized by degeneration of the substantia nigra making this region 
more difficult to obtain. Normal control brains were examined for neuropathology and found to be 
free of any pathology consistent with neurodegeneration. 

In the labels employed to identify tissues in the CNS panel, the following abbreviations 
are used: 

PSP = Progressive supranuclear palsy 
Sub Nigra = Substantia nigra 
Glob Palladus = Globus pallidus 
Temp Pole = Temporal pole 
Cing Gyr = Cingulate gyrus 
BA 4 = Brodmann Area 4 

Panel CNS_Neurodegeneration_V1.0 

The plates for Panel CNS_Neurodegeneration_V1.0 include two control wells and 47 test 
samples comprised of cDNA isolated from postmortem human brain tissue obtained from the 
Harvard Brain Tissue Resource Center (McLean Hospital) and the Human Brain and Spinal Fluid 
Resource Center (VA Greater Los Angeles Healthcare System). Brains are removed from calvaria 
of donors between 4 and 24 hours after death, sectioned by neuroanatomists, and frozen at -80°C 
in liquid nitrogen vapor. All brains are sectioned and examined by neuropathologists to confirm 
diagnoses with clear associated neuropathology. 

Disease diagnoses are taken from patient records. The panel contains six brains from 
Alzheimer's disease (AD) patients, and eight brains from "Normal controls" who showed no 
evidence of dementia prior to death. The eight normal control brains are divided into two 
categories: controls with no dementia and no Alzheimer's-like pathology (Controls) and controls 
with no dementia but evidence of severe Alzheimer's-like pathology (specifically, senile plaque 
load rated as level 3 on a scale of 0-3; 0 = no evidence of plaques, 3 = severe AD senile plaque 
load). Within each of these brains, the following regions are represented: hippocampus, temporal 
cortex (Brodmann Area 21), parietal cortex (Brodmann Area 7), and occipital cortex (Brodmann Area 
17). These regions were chosen to encompass all levels of neurodegeneration in AD. The 
hippocampus is a region of early and severe neuronal loss in AD; the temporal cortex is known to 
show neurodegeneration in AD after the hippocampus; the parietal cortex shows moderate 
neuronal death in the late stages of the disease; the occipital cortex is spared in AD and therefore 
acts as a "control" region within AD patients. Not all brain regions are represented in all cases. 

In the labels employed to identify tissues in the CNS_Neurodegeneration_V1.0 panel, the 
following abbreviations are used: 

AD = Alzheimer's disease brain; patient was demented and showed AD-like pathology 
upon autopsy 

Control = Control brains; patient not demented, showing no neuropathology 
Control (Path) = Control brains; patient not demented but showing severe AD-like 
pathology 

SupTemporal Ctx = Superior Temporal Cortex 
Inf Temporal Ctx = Inferior Temporal Cortex 
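
The sample labels in the CNS_Neurodegeneration_V1.0 tables follow a regular "diagnosis, donor number, region" pattern, so they can be split mechanically. The following is a minimal sketch under stated assumptions: the function name `parse_label` and the regular expression are hypothetical, and the expansion "Hippo" = Hippocampus is inferred from the panel description rather than listed among the abbreviations above.

```python
import re

# Region abbreviations as they appear in the table labels.
# "Hippo" is an assumption (inferred from the panel description);
# the two temporal-cortex entries come from the abbreviation list above.
REGIONS = {
    "Hippo": "Hippocampus",
    "Inf Temporal Ctx": "Inferior Temporal Cortex",
    "Sup Temporal Ctx": "Superior Temporal Cortex",
    "SupTemporal Ctx": "Superior Temporal Cortex",
    "Temporal Ctx": "Temporal Cortex",
    "Parietal Ctx": "Parietal Cortex",
    "Occipital Ctx": "Occipital Cortex",
}

# "Control (Path)" is tried before "Control" so the longer diagnosis wins.
LABEL_RE = re.compile(r"^(Control \(Path\)|Control|AD) (\d+) (.+)$")


def parse_label(label):
    """Split a label such as 'AD 5 Inf Temporal Ctx' into its parts."""
    normalized = " ".join(label.split())  # collapse stray whitespace
    match = LABEL_RE.match(normalized)
    if match is None:
        raise ValueError(f"unrecognized sample label: {label!r}")
    diagnosis, donor, region = match.groups()
    return {
        "diagnosis": diagnosis,
        "donor": int(donor),
        "region": REGIONS.get(region, region),  # unknown regions pass through
    }


print(parse_label("AD 5 Inf Temporal Ctx"))
print(parse_label("Control (Path) 3 Occipital Ctx"))
```

Grouping parsed labels by diagnosis and region is one way to compare AD, Control and Control (Path) samples region by region.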

A. NOV1a: Delta serrate ligand receptor (also known as MEGF) 

Expression of the NOV1a gene (COR87920446_A) was assessed using the primer-probe 
set Ag3978, described in Table A1. Results of the RTQ-PCR runs are shown in Tables A2, A3 
and A4. 



Table A1. Probe Name Ag3978 

Primers | Sequences | Length | Start Position | Seq ID No. 
Forward | 5'-ctggaccgaagctacagctata-3' | 22 | 2605 | 86 
Probe | TET-5'-atggcccaggcccattctacaataaa-3'-TAMRA | 26 | 2636 | 87 
Reverse | 5'-cgagctcctcttcagagatga-3' | 21 | 2666 | 88 
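
As a consistency check on Table A1, the listed oligonucleotide lengths can be recomputed from the sequences themselves. This is only an illustrative sketch: the dictionary layout and the function name `check_oligo` are hypothetical, while the sequences, lengths and start positions are those given in the table.

```python
# Sequences, listed lengths and start positions as given in Table A1.
AG3978 = {
    "Forward": ("ctggaccgaagctacagctata", 22, 2605),
    "Probe": ("atggcccaggcccattctacaataaa", 26, 2636),
    "Reverse": ("cgagctcctcttcagagatga", 21, 2666),
}


def check_oligo(sequence, listed_length):
    """Return True when the sequence is plain DNA and its length matches the table."""
    if set(sequence) - set("acgt"):
        raise ValueError(f"unexpected character in sequence: {sequence!r}")
    return len(sequence) == listed_length


for name, (sequence, length, start) in AG3978.items():
    status = "ok" if check_oligo(sequence, length) else "MISMATCH"
    print(f"{name:8s} start={start} length={len(sequence)} ({status})")

# The probe is expected to sit between the two primers.
starts = [AG3978[name][2] for name in ("Forward", "Probe", "Reverse")]
print("start positions in ascending order:", starts == sorted(starts))
```

The same check applies unchanged to the other primer-probe sets in this section.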



Table A2. CNS_neurodegeneration_v1.0 

Tissue Name | Rel. Exp.(%) Ag3978, Run 206880050 | Tissue Name | Rel. Exp.(%) Ag3978, Run 206880050 

AD 1 Hippo 



21.2 



Control (Path) 3 
Temporal Ctx 




17.9 



AD 2 Hippo 



AD 3 Hippo 



43.5 



7.5 



Control (Path) 4 
Temporal Ctx 

lAD 1 Occipital Ctx 



42.9 



22.4 
AD 4 Hippo 



8.9 



|AD 2 Occipital Ctx 
{(Missing) 



}AD 5 Hippo 
AD 6 Hippo 



_ j . 



20.2 



IAD 3 Occipital Ctx 



Control 2 Hippo 



Control 4 Hippo 



100.0 



13.6 



|AD 4 Occipital Ctx 
(AD 5 Occipital Ctx 



8.3 



IAD 6 Occipital Ctx 



0.0 



6.4 



8.7 



65.1 



40.3 



Control (Path) 3 Hippo j 


^ ^ jControl 1 Occipital 

jC'tx | 


6.6 


AD 1 Temporal Ctx | 21.6 | Control 2 Occipital Ctx | 20.7 
AD 2 Temporal Ctx | 26.2 | Control 3 Occipital Ctx | 18.0 
AD 3 Temporal Ctx | 0.0 | Control 4 Occipital Ctx | 4.2 
AD 4 Temporal Ctx | 9.2 | Control (Path) 1 Occipital Ctx | 36.1 
AD 5 Inf Temporal Ctx | 24.5 | Control (Path) 2 Occipital Ctx | 30.6 
AD 5 Sup Temporal Ctx | 17.7 | Control (Path) 3 Occipital Ctx | 16.5 
AD 6 Inf Temporal Ctx | 84.7 | Control (Path) 4 Occipital Ctx | 60.7 
AD 6 Sup Temporal Ctx | 79.6 | Control 1 Parietal Ctx | 61 
Control 1 Temporal Ctx | 2.5 | Control 2 Parietal Ctx | 13.7 
Control 2 Temporal Ctx | 17.9 | Control 3 Parietal Ctx | 23.2 
Control 3 Temporal Ctx | 17.8 | Control (Path) 1 Parietal Ctx | 32.8 
Control 3 Temporal Ctx | 9.2 | Control (Path) 2 Parietal Ctx | 41.5 
Control (Path) 1 Temporal Ctx | 79.6 | Control (Path) 3 Parietal Ctx | 24.8 
Control (Path) 2 Temporal Ctx | 23.3 | Control (Path) 4 Parietal Ctx | 31.0 



Table A3. General_screening_panel_v1.4 

Tissue Name | Rel. Exp.(%) Ag3978, Run 217525358 | Tissue Name | Rel. Exp.(%) Ag3978, Run 217525358 


Adipose 


3.1 


Renal ca. TK- 10 


5.4 


Melanoma* 
Hs688(A).T 


3.9 


Bladder 


1.2 


Melanoma* 
Hs688(B).T 


10.0 

i . ................ 


Gastric ca. (liver met.) 
[NCI-N87 


0.2 
(Melanoma* M14 ^ OO ^ _ _ J^^^J^93, 



Melanoma* \ 3 2 Icolon ca. SW-948 j 
LOXIMVI \ \ J 


0.4 j 


Melanoma* SK- 1 
MEL-5 

Squamous cell 
carcinoma SCC-4 \ 


0.0 l 

1.1 j 


Colon ca. SW480 1.1 

Colon ca.*(SW480 , Q1 
met) SW620 


Testis Pool 


2.4 


Colon ca.HT29 j 0.1 


Prostate ca.* (bone ; 
met) PC-3 


4.8 


Colon ca.HCT-1 16 


1.6 


Prostate Pool 


1.1 


Colon ca. CaCo-2 j 


0.3 


Placenta 


1.8 


Colon cancer tissue 


2.0 


Uterus Pool 


1.1 


Colon ca. SW1116 


0.6 


Ovarian ca. OVCAR- 
3 


A 1 

0.1 


Colon ca. Colo-205 

„ „„„ _____ . „„„„„ ._J 


0.0 


Ovarian ca. SK-OV-3j 


92.7 


Colon ca. SW-48 j 0.1 


Ovarian ca. OVCAR- 
4 


— - -I 

0.1 


Colon Pool 


4.2 


Ovarian ca. OVCAR-j 
5 


11.5 


Small Intestine Pool 


5.6 


[Ovarian ca. IGROV- 
1 


0 1 


Stomach Pool 


92.7 


| Ovarian ca. OVCAR- 
8 


0.5 


Bone Marrow Pool 


1.7 

i ........ 


Ovary 


1.1 


Fetal Heart 1.5 


I Breast ca. MCF-7 


0.1 


Heart Pool 1 1.3 


Breast ca. MDA-MB- 
231 


1.0 


Lymph Node Pool 


1 6.5 


Breast ca. BT 549 


0.0 


Fetal Skeletal Muscle 


Breast ca. T47D 
! Breast ca. MDA-N 


21.3 
0.0 


^Skeletal Muscle Pool 1.5 
r SpleenPooi j 2.1 


Breast Pool 


100.0 


Thymus Pool 


i 3.1 


\ I racnea 


1 Q 

i .y 


CNS cancer (glio/astro) 
U87-MG 


0.1 


Lung 


0.5 


CNS cancer (glio/astro) 
U-118-MG 


0.3 


Fetal Lung 


4.9 


CNS cancer (neuro;met) 
SK-N-AS 


3.8 


Lung ca. NCI-N417 
Lung ca. LX-1 


91.4 


: CNS cancer (astro) SF- \ A ~ 

A 3 __. _ j. ;_ _ 

CNS cancer (astro) 
SNB-75 


0.2 



Lung ca. NCI- H 146 I 




0.0 


CNS cancer (glio) SNB- . 
19 i 




0.0 ! 


^^^^ >— ~~ f 

jLung ca. SHP-77 

I * 

Lung ca. A549 j 
Lung ca. NCI-H526 I 
Lungca. NCI-H23 

Lung ca. NCI-H460 j 


0.1 

0.5 j 

o.d~ ; 

0.1 

88.3 


CNS cancer (glio) SF- j 

Brain (Amygdala) Pool j 
Brain (cerebellum) J 
Brain (fetal) 

Brain (Hippocampus) j 
Pool j 


0.0 
"0.3 

0.3 j 
0.8~ 

0.6 


Lung ca. HOP-62 J 


0.5 


Cerebral Cortex Pool 0.5 


Lung ca. NCI-H522 


0.3 


Brain (Substantia nigra) j 
Pool 


0.8 


— 

Liver 


0.3 


Brain (Thalamus) Pool j 


0.5 


Fetal Liver 


2.9 


Brain (whole) 


95.9 


\ Liver ca. HepG2 


0.0 


Spinal Cord Pool 


~ 0.6 


i Kidney Pool 


8.0 


Adrenal Gland 


2.3 


Fetal Kidney 


2.9 


Pituitary gland Pool 


0.6 


| Renal ca. 786-0 


0.0 


Salivary Gland 


0.4 


1 Renal ca. A498 


0.5 jThyroid (female) 


1.1 


Renal ca. ACIIN 


2.0 jPancreatic ca. CAPAN2 


0.8 


Renal ca.UO-31 


2.2 JPancreas Pool 


3.0 


Table A4. Panel 4.1D 


Tissue Name | Rel. Exp.(%) Ag3978, Run 170737278 | Tissue Name | Rel. Exp.(%) Ag3978, Run 170737278 


Secondary Thl act 


0.0 fHUVEC IL-lbeta 


33.4 


Secondary Th2 act 


0.0 


HUVEC IFN gamma 


28.1 


Secondary Trl act 


0.0 


HUVEC TNF alpha + IFN 
gamma 


25.9 


[Secondary Thl rest 


0.0 


HUVEC TNF alpha + IL4 , 


46.7 


Secondary Th2 rest 


0.0 


HUVEC IL- 11 


23.3 


| Secondary Trl rest 
Primary Thl act 


0.0 
0.0 


.Lung Microvascular EC 
none 

■Lung Microvascular EC 
TNFalpha + IL-lbeta 


100.0 

39.0 


; Primary Th2 act 


0.0 


Microvascular Dermal EC 
jnone 


66.9 

. . — j 


I Primary Trl act 


0.0 

! : 


] ' Micros vasular Dermal EC 
TNFalpha I- IL-lbeta 


36.3 


Primary Th1 rest 

y i 


iBronchial epithelium f 
[TNFalpha + 1L 1 beta j 


7.8 ! 

i 

^ \ 


Primarv Th2 rest 
Primarv Trl rpst i 

x i iiiicii y x i x i wol p 


I Small airway epithelium j 
'none j 
i Small airway epithelium j 
U,U jTNFalpha + IL-1 beta j 


15 ! 

22.7 j 


CD45RA CD4 
lymphocyte act 


" ■ " j 

1.2 ICoronery artery SMC rest j 14.5 

i L „ 


CD45RO CD4 
lymphocyte act 


iCoronery artery SMC ] 
U -° JTNFalpha +IL : lbeta j 


9.0 


CD8 lymphocyte act 


0.0 jAstrocytes rest 


0.9 


Secondary CD8 
lymphocyte rest 

Secondary CD8 
lymphocyte act 


] Astrocytes TNFalpha + IL- 
°- U jlbeta^ 

0.0 jKU-8 12 (Basophil) rest 

i 


1 4 

1 .*t 

2.3 


f~*T*\A. 1"\/TnrVhnf , \7tp tump s 
K^/LJH i jiiiuiiuuy lc nunc; ] 


0.0 


KU-8 12 (Basophil) 
PMA/ionomycin 


1.6 


2ryThl/Th2/Trl_anti- 
CD95 CHI 1 


0.0 


CCD1106(Keratinocytes) j 
none 


0.3 


iLAK cells rest 


0.0 


CCD 1 1 06 (Keratinocytes) 1 
TNFalpha + IL-lbeta 


0.0 


LAK cells IL-2 ; 
LAK cells IL-2+IL-12 


' 0.0 " 
0.0 j 


Liver cirrhosis 
NCI-H292 none 


"0.9" 
1.3 


LAK cells IL-2+IFN 
(gamma 


0.0 


NCI-H292 IL-4 


3.4 


LAK cells IL-2+ IL-18 


0.0 


NCI-H292 IL-9 


J.J 


j LAK cells 
PMA'ionomycin 


0.3 


NCI-H292 IL-1 3 


3.4 


NK Cells IL-2 rest j 0.0 


NCI-H292 IFN gamma 


? Q 


Two Way MLR 3 day 


0.0 


HPAEC none 


49.7 


Two Way MLR 5 day 


0.0 


HPAEC TNF alpha + IL-1 
beta 


40.9 


Two Way MLR 7 day 


0.0 


Lung fibroblast none 


4.8 


PBMC rest 


jLung fibroblast TNF alpha 
U |+ IL-1 beta 


2.7 


PBMC PWM 


0.0 jLung fibroblast IL-4 


2.8 


PBMC PHA-L 

I Ramos (B cell) none 


0.0 

"6b 


JLung fibroblast IL-9 
Lung fibroblast IL-1 3 


7.6""" 


Ramos (B cell) 
ionomycin 


0.0 


Lung fibroblast IFN gamma 


2.4 


B lymphocytes PWM 


0.0 


Dermal fibroblast 
CCD1070 rest 


5.1 



[B lymphocytes CD40L j ft0 jDermal fibroblast 
|andIL-4 \ jCCD 1 070 TNF alpha 


1 

0.5 ! 

j 


FOT -1 r\hr AMP 

|eOL-1 dbcAMP " 
[PMA/ionomycin 


n , JDermal fibroblast 
JCCD 1 070 IL-1 beta 

j j jDermal fibroblast IFN 
jgamma 


1.6 

u.u 


Dendritic cells none 


0.3 (bermal fibroblast IL-4 


0.0 


Dendritic cells LPS 


0.0 jDermal Fibroblasts rest 


0.4 


Dendritic cells anti- 
CD40 


j ° " **" ■ t "" ' 

0.0 ]Neutrophils TNFa+LPS 




Monocytes rest 


0.0 


Neutrophils rest 


0.6 


Monocytes LPS 


0.0 


Colon 


0.2 


Macrophages rest 


0.3 j 


Lung 


4.5 


Macrophages LPS 


0.0 


Thymus 


2.1 


HUVEC none 


42.9 


Kidney 


2.2 


HUVEC starved 


56.6 







CNS_neurodegeneration_v1.0 Summary: Ag3978 This panel confirms expression of 
the COR87920446_A gene at low levels in the brain in an independent group of individuals. 
However, no differential expression of this gene was detected between Alzheimer's diseased 
postmortem brains and those of non-demented controls in this experiment. Please see Panel 1.4 
for a discussion of the potential utility of this gene in the treatment of central nervous system 
disorders. 

General_screening_panel_v1.4 Summary: Ag3978 Expression of the COR87920446_A 
gene is highest in samples derived from normal breast, stomach and brain tissues (CTs = 26.6). 
Thus, the expression of this gene could be used to distinguish these samples from the other 
samples in the panel. In addition, there is substantial expression of this gene associated with an 
ovarian cancer cell line and two lung cancer cell lines. Therefore, therapeutic modulation of the 
activity of this gene or its protein product, through the use of small molecule drugs, protein 
therapeutics or antibodies, might be beneficial in the treatment of lung cancer or ovarian cancer. 

In addition, this gene is expressed at low levels in all CNS regions examined, including 
amygdala, cerebellum, hippocampus, cerebral cortex, substantia nigra, thalamus and spinal cord 
(CTs = 33-35). Interestingly, COR87920446_A gene expression is significantly higher in adult 
brain (CT = 26.6) than in fetal brain (CT = 33.5), suggesting that expression of this gene may be 
used to distinguish between adult and fetal brain. This gene encodes a protein with homology to 
the MEGF protein, and may therefore interact with Notch receptors in 
neurodevelopment. This protein could therefore be of use in directing compensatory 
synaptogenesis in clinical conditions involving neuronal death such as stroke and head trauma, 
and neurodegenerative diseases such as Alzheimer's, Parkinson's, and Huntington's diseases. 
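
To put the adult versus fetal brain CT values in perspective, a CT difference can be converted into an approximate fold difference in template abundance. The sketch below assumes ideal amplification (template doubling every cycle), which is a common simplification and is not stated in this section; the function name `fold_difference` is hypothetical.

```python
def fold_difference(ct_lower_expression, ct_higher_expression, efficiency=2.0):
    """Approximate fold difference implied by two CT values.

    Assumes the amount of product multiplies by `efficiency` each cycle
    (2.0 = perfect doubling), so each extra cycle needed to reach threshold
    corresponds to `efficiency`-fold less starting template.
    """
    return efficiency ** (ct_lower_expression - ct_higher_expression)


# CT values quoted in the summary above.
adult_brain_ct = 26.6
fetal_brain_ct = 33.5

fold = fold_difference(fetal_brain_ct, adult_brain_ct)
print(f"adult brain vs fetal brain: roughly {fold:.0f}-fold higher")
```

Under this assumption the 6.9-cycle gap corresponds to a difference of roughly two orders of magnitude. Real assays rarely reach perfect efficiency, so the figure is indicative only.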

This gene is also expressed at low to moderate levels in a number of tissues with 
metabolic or endocrine function, including adipose, adrenal gland, gastrointestinal tract, pancreas, 
skeletal muscle and thyroid. Therefore, therapeutic modulation of the activity of this gene may 
prove useful in the treatment of endocrine/metabolically related diseases, such as Type II diabetes. 

Panel 4.1D Summary: Ag3978 The COR87920446_A gene is expressed at low to 
moderate levels in endothelial cells (HUVEC, HPAEC) as well as in epithelium (CTs = 30-32). 
Activation with a variety of cytokines does not significantly change expression. This gene may 
encode a ligand for Notch; Notch-ligand interactions play an essential role during limb, 
craniofacial, and thymic development in mice. Multiple ligands that activate Notch and related 
receptors have been identified, including Serrate and Delta in Drosophila and JAG1 in vertebrates 
[602570; OMIM]. This family of molecules is also important in fate determination and 
development. Therefore, therapeutics designed with the protein encoded by this transcript 
could be important for wound healing and organogenesis. Such therapeutics could be important in 
the treatment of emphysema, psoriasis, arthritis, cirrhosis and inflammatory bowel disease, where 
there is considerable damage due to inflammation or aberrant wound healing. 

References: 

1. Shutter JR, Scully S, Fan W, Richards WG, Kitajewski J, Deblandre GA, Kintner CR, 
Stark KL. Dll4, a novel Notch ligand expressed in arterial endothelium. Genes Dev 2000 Jun 
1;14(11):1313-8 

We report the cloning and characterization of a new member of the Delta family of Notch 
ligands, which we have named Dll4. Like other Delta genes, Dll4 is predicted to encode a 
membrane-bound ligand, characterized by an extracellular region containing several EGF-like 
domains and a DSL domain required for receptor binding. In situ analysis reveals a highly 
selective expression pattern of Dll4 within the vascular endothelium. The activity and expression 
of Dll4 and the known actions of other members of this family suggest a role for Dll4 in the 
control of endothelial cell biology. 

PMID: 10837024 
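
The statement in the Panel 4.1D summary that cytokine activation does not significantly change expression can be illustrated with the HUVEC rows of Table A4. This is a sketch only: the two-fold cutoff is an assumption chosen for illustration and is not a criterion given in the text, and the helper names are hypothetical.

```python
# Relative expression (%) for HUVEC samples, read from Table A4.
HUVEC_REL_EXP = {
    "none": 42.9,
    "IL-1beta": 33.4,
    "IFN gamma": 28.1,
    "TNF alpha + IFN gamma": 25.9,
    "TNF alpha + IL4": 46.7,
    "IL-11": 23.3,
}


def ratios_to_untreated(rel_exp, baseline="none"):
    """Ratio of each treated sample to the untreated baseline."""
    base = rel_exp[baseline]
    return {name: value / base for name, value in rel_exp.items() if name != baseline}


def within_fold(ratio, max_fold=2.0):
    """True when the ratio stays inside the assumed max_fold window."""
    return 1.0 / max_fold <= ratio <= max_fold


ratios = ratios_to_untreated(HUVEC_REL_EXP)
for treatment, ratio in ratios.items():
    flag = "within 2-fold" if within_fold(ratio) else "outside 2-fold"
    print(f"HUVEC {treatment:22s} {ratio:4.2f}x untreated ({flag})")
```

With these table values every treated sample stays within a factor of two of untreated HUVEC, in line with the summary.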

B. NOV2: Novel Kinase 

Expression of the NOV2 gene (COR87940554) was assessed using the primer-probe set 
Ag3979, described in Table B1. Results of the RTQ-PCR runs are shown in Tables B2, B3, and 
B4. 



Table B1. Probe Name Ag3979 

Primers | Sequences | Length | Start Position | Seq ID No. 
Forward | 5'-gctccttcaagacggtgtatc-3' | 21 | 612 | 89 
Probe | TET-5'-ctagacaccgacaccacagtggaggt-3'-TAMRA | 26 | 638 | 90 
Reverse | 5'-ccgctcagctctagacagttt-3' | 21 | 689 | 91 



Table B2. General_screening_panel_v1.4 

Tissue Name | Rel. Exp.(%) Ag3979, Run 217534174 | Tissue Name | Rel. Exp.(%) Ag3979, Run 217534174 


Adipose 


1.3 


Renal ca. TK-10 


1 A C 

14.5 


Melanoma* 
Hs688(A).T 


20.7 


Bladder 


0.6 


Melanoma* 
Hs688(B).T 


91.4 


Gastric ca. (liver met.) 
NCI-N87 


6.7 


Melanoma* Ml 4 \ 


8.6 


Gastric ca. KATO III : 


0.3 


Melanoma* 
LOXIMVI 


4.2 


Colon ca. SW-948 


~ z 




Melanoma* SK- 
MEL-5 


0.8 


Colon ca. SW480 


4.5 


Squamous cell 
carcinoma SCC-4 


0.5 


Colon ca * (SW480 
met) SW620 


5.9 


Testis Pool 


I 0.8 


Colon ca. HT29 


24.3 


Prostate ca.* (bone 
met) PC-3 


100.0 


Colon ca. HCT-116 


5.1 


Prostate Pool 


15.4 


Colon ca. CaCo-2 


39.8 


Placenta 


0.0 


Colon cancer tissue 


24.1 


Uterus Pool 


0.3 


Colon ca. SW1116 


0.6 


Ovarian ca. OVCAR- 
3 


0.5 


Colon ca. Colo-205 


0.2 


Ovarian ca. SK-OV-3 


0.7 


Colon ca. SW-48 


"~ "1518 ' 


Ovarian ca. OVCAR- 


0.5 


Colon Pool 


6.0 


Ovarian ca. OVCAR- 
5 



Ovarian ca. TGROV- 
1 



T 



Ovarian ca. OVCAR-J 
8 



Ovary 

Breast ca. MCF-7 

■4,*. 



Breast ca. MDA-MB-j 
231 



Breast ca. BT 549 



Breast ca. T47D 



11.5 



iSmall Intestine Pool 



4- 



1.3 



2.4 



Stomach Pool 
Bone Marrow Pool 



3.2 



I Fetal Heart 



0.8 



6.1 



jHeart Pool 

r 

jLymph Node Pool 



21.8 



16.0 



Fetal Skeletal Muscle 



Skeletal Muscle Pool 



2.9 



0.5 



0.6 



0.5 



0.1 



0.3 



0.0 



0.0 



•*-H 



Breast ca. MDA-N 



Breast Pool 
Trachea 



0.9 



Spleen Pool 



0.2 



Thymus Pool 



6.3 



CNS cancer (glio/astro) 
U87-MG 



0.3 
0.1 



0.3 



Lung 

Fetal Lung 



1.2 



2.9 



CNS cancer (glio/astro) ' 
U-118-MG 

CNS cancer (neuro;met) J 
SK-N-AS J 



0.3 
0.1 



Lungca. NCI-N417 



Lung ca. LX-1 



0.0 



CNS cancer (astro) SF- 
539 



6.9 



CNS cancer (astro) 
SNB-75 



0.0 



1.7 



Lung ca. NCI-H146 



Lung ca. SHP-77 



0.1 



CNS cancer (glio) SNB- 
19 



1.0 



CNS cancer (glio) SF- 
295 



0.7 



11.4 



Lung ca. A549 



5.4 



Brain (Amygdala) Pool 



0.3 



Lung ca. NCI-H526 



Lung ca. NCI-H23 



Lung ca. NCI-H460 



Lung ca. HOP-62 



Lung ca. NCI-H522 



0.1 



'Brain (cerebellum) 



3.1 



0.8 



12.8 



5.6 



Brain (fetal) 



Brain (Hippocampus) 
Pool 



Cerebral Cortex Pool 



Brain (Substantia nigra) i 
Pool 



1.7 
2.1 



1.3 



0.5 



0.2 



Liver 



0.0 



Brain (Thalamus) Pool 



0.3 



Fetal Liver 



Liver ca. HepG2 



52.5 



Brain (whole) 



28.7 



Spinal Cord Pool 



1.0 



0.4 



Kidney Pool 



0.0 



jAdrenal Gland 
193 



0.2 



Fetal Kidney 
Renal ca. 786-0 



Renal ca. A498 
Renal ca. ACHN 



T 
+ 



24.5 



0.9 



Pituitary gland Pool 
: Salivary Gland 



0.6 
0.6 



1.6 



iThyroid (female) 



0.0 



32.1 



■Pancreatic ca. CAPAN2 { 



1.5 



Renal ca. UO-31 



20.7 



'Pancreas Pool 



1.6 



Table B3. Panel 2.1 


Tissue Name 

j 

Normal Colon 


R V* P S,tf™ 79 'i Tissue Name 1 
Run 170721574 J i 

Q flCidney Cancer 
]9010320 


ReL Exp.(%) Ag3979,| 

Run 170721574 I 

i 

0.0 


Colon cancer (OD06064) 


jjKidney margin 
U |9010321 


44.4 


Colon cancer margin 


, jKidney Cancer 

8120607 \ 


1.5 


Colon cancer (OD061 59) \ 


3.5 


Kidney margin 
8120608 


8.7 


(OD06159) 


2.4 


Normal Uterus 


0.0 


Colon cancer (OD06298- 
08) 


30.8 


Uterus Cancer 


0.6 


f^nlnn p^nrpr marenn 

V_^UHJli Wdll^C/1 lllCilglll 

(OD06298-018) 


12.7 
4.7 


Normal Thyroid 


0.0 


colon (OD03921) 


Thyroid Cancer 


0.0 


r^olon r^ancer margin 
(OD03921) 


4.1 


Thyroid Cancer 
A302152 


0.0 


1 

(OD06104) 


0.8 


Thyroid margin n n 
A302153 ~ | 


Lung margin (OD061 04) 


1.2 


Normal Breast J 3.2 


Colon mets to lung 
(OD04451-01) 

Lung margin (OD0445 1 - 
02) 


9.5 
0.0 


Breast Cancer 
Breast Cancer 


3.4 
2.9 


[Normal Prostate 


9.3 


Breast Cancer 
(OD04590-01) 


1.3 


| Prostate Cancer (OD04410) 


6.4 


Breast Cancer Mets 
(OD04590-03) 


2.3 


Prostate margin (OD04410) 


9.0 


Breast Cancer 
Metastasis 


100.0 

I 


Normal Lung 
Invasive poor diff. lung 


0.2 

[ 0.3 "' 


Breast Cancer 
Breast Cancer 


| 0.0 
1" " ~2.8 
adeno 1 (ODO4945-01) ' 19100266 


Lung margin (OD04945- ' JBreast margin ; i 
03) j |9 100265 \ 


Lung Malignant Cancer _ _ 
(01)03126) 

Lung margin (OD03 126) \ 0.6 

j 


Breast Cancer < 
A209073 \ 

Breast margin * 
A2090734 ; 


Lung Cancer (OD05014A) 


0.0 


Normal Liver j 0.0 

- 2 - - n -rim n rr rr ? 1 


Lung margin (OD05014B) 


0.0 


Liver Cancer 1026 \ 2.6 

~~ -~^z- _ _ „ 


Lung Cancer (OD04237- 
01) 


0 0 


1 

Liver Cancer 1U25 j U.3 


Lung margin (OD04237- 
02) 


0.0 


Liver Cancer 6004-T? 0.6 


Ocular Mel Met to Liver 
(ODO4310) 

Liver margin (OD043 1 6) 


3.9 


Liver Tissue 6004-N j 1.4 
Liver Cancer 6005-T j 4.4 


Melanoma Mets to Lung 
(OD04321) 


13.9 


Liver Tissue 6005-N j 0.0 


Lung margin (OD04321) 


0.0 


Liver Cancer T 1.7 


Normal Kidney 


19.9 


Normal Bladder 


0.0 


Kidney Ca, Nuclear grade 2 
(OD04338) 


76.8 


Bladder Cancer 




6.7 

..... ..... 


Kidney margin (OD04338) 


1.5 


Bladder Cancer \ 0.0 


Kidney Ca Nuclear grade 
1/2 (OD04339) 


0.7 


Normal Ovary 


1.8 


Kidney margin (OD04339) 


19.1 


Ovarian Cancer 


0.0 


Kidney Ca, Clear cell type 
(OD04340) 


0.0 


Ovarian cancer 
(OD06145) 


0.0 


Kidney margin (OD04340) 


15.4 


Ovarian cancer 
margin (OD06145) 


0.0 


Kidney Ca, Nuclear grade 3 
(OD04348) 


0.0 


Normal Stomach 


1.8 


Kidney margin (OD04348) j 


20.7 


Gastric Cancer 
9060397 


2.5 


Kidney Cancer (OD04450- 
01) 


1.4 
42.9 


Stomach margin 
9060396 


1.2 


Kidney margin (OD04450- 
03) 


Gastric Cancer 
9060395 


1.0 


Kidney Cancer 8120613 


0.0 


Stomach margin 
9060394 j 


Kidney margin 8120614 9.9 


Gastric Cancer L0 

Table B4. Panel 4.1D 



I issue name 


ReL Exp.(%) 

i\Qjy i ivu u j ii5»uc ndiiic 
170721251 


T» 1 T7« /ft/ \ 

ReL Exp.(%) 
AalQ79 Run 
170721251 


oGConQary iqi dci 


0 0 ilTTTVFC TT -1heta ! 


1 2 


Secondary Th2 act 


0.0 HUVEC IFN gamma 


1.8 


Secondary Trl act 


Q0 [HUVEC TNF alpha + IFN 
igamma 


0.2 


Secondary Thl rest 


0.0 'HUVEC TNF alpha + IL4 | 


0.3 


Secondary Th2 rest 


0.0 JHUVEC IL-11 


0.9 


Secondary Trl rest 


jLung Microvascular EC 
Jnone 


100.0 


Primary Thl act 


0.1 


Lung Microvascular EC 
TNFalpha + IL-lbeta 


58.2 
72.2 


Primary Th2 act 


0.0 


Microvascular Dermal EC 
none 


Primary Trl act 


0.0 


Microsvasular Dermal EC 
TNFalpha + IL-lbeta 


48.0 


Primary Thl rest 


n n (Bronchial epithelium 
iTNFalpha + ILlbeta 


3.4 


Primarv Th2 rest 


0.0 


Small airway epithelium 
none 


0.0 


Primarv Trl rest 


0.0 


Small airway epithelium 
TNFalpha + IL-lbeta 


0.7 

i 
i 


CD45RA CD4 
lymphocyte act 


5.2 


Coronery artery SMC rest 


39.5 


CD45RO CD4 
lymphocyte act 


0.0 


Coronery artery SMC 
TNFalpha + IL-lbeta 


40.6 


CD 8 lymphocyte act 


0.0 


Astrocytes rest 


12.1 


Secondary CD8 
lymphocyte rest 


0.0 


Astrocytes TNFalpha + IL- 
lbeta 


27.5 


Secondary CD8 
lymphocyte act 


0.0 


r r - - ■ " 

iKU-812 (Basophil) rest 


0.0 


CD4 lymphocyte none 


0.0 


KU-8 12 (Basophil) 
PMA/ionomycin 


0.0 


2ry Thl/Th2/Trl anti- 
CD95CH11 


0.0 


CCD 1106 (Keratinocytes) 
none 


1.7 


LAK cells rest 


0.0 


CCD1106 (Keratinocytes) 
TNFalpha + IL-lbeta 


1.3 


LAK cells IL-2 


0.0 


Liver cirrhosis 


0.6 


LAK cells IL-2+IL-12 

LAK cells IL-2+IFN 
gamma j 


0.0 5NCI-H292 none ! 0.5 

0.0 ;NCI-H292 IL-4 j 0.8 j 


uSTceils IL-2+JL-18 ~j 


s\ r\ \TrtT TT^A^ TT C\ 5 

0.0 5NCI-H292 IL-9 | 


i.Z ! 

T , ,. j 


LAK cells j 

PMA/ionomycin 

" — f 

NK Cells IL-2 rest ■ 

Two Way MLR 3 day 
Two Way MLR 5 day ' 


0.0 jNCI-H292IL-13 

0.0 |NCI-H292 lrN gamma 
0.0 jHPAEC none 
Q() jHPAECTNF alpha + IL-1 
,beta 


0.3 
u.o 

3.4 

7.5 


: Two Way MLR 7 day 


0.2 JLung fibroblast none 1 


O.J 


PBMC rest 


lEung fibroblast TNF alpha 
|+ IL-1 beta 


0.1 


PBMC PWM : 


0.0 |Lung fibroblast IL-4 


0.0 


PBMC PHA-L 


0.0 f 


Lung fibroblast IL-9 


0.0 


| Ramos (B cell) none 


0.0 


Lung fibroblast IL-1 3 


0.6 


Ramos (B cell) 
lionomycin 


0.4 


Lung fibroblast IFN gamma 


0.9 


\Vt K/mnhnrvtP<; PW1VT 


0.0 


Dermal fibroblast 
CCD 1070 rest 


14.3 


B lymphocytes CD40L 
and IL-4 


0.0 


Dermal fibroblast 
CCD1070 TNF alpha 


9.9 


EOL-1 dbcAMP 


(Dermal fibroblast 
0,9 JCCD1070 IL-1 beta 


11.2 


EOL-1 dbcAMP 
PMA/ionomycin 


1 {Dermal fibroblast IFN 
jgamma 


0.9 


Dendritic cells none 


0.0 


Dermal fibroblast IL-4 


u.z 


Dendritic cells LPS 


0.0 


Dermal Fibroblasts rest 


u.u 


1 Dendritic cells anti- 
CD^ 

1 Monocytes rest 


0.3 
0.0 


Neutrophils TNFa+LPS 
Neutrophils rest 


0.3 
6.3 


I Monocytes LPS 
Macrophages rest 
Macrophages LPS 
HUVEC none 
HUVEC starved 


0.0 

j 0.0 " 


Colon 

Lung 

Thymus 


; 2.0 

i "6j0 


t~~~' 0.0"""" 


i 45 


3.5 


Kidney 


29^1 ' 


1.5" 







CNS_neurodegeneration_v1.0 Summary: Ag3979 Expression of the COR87940554 
gene is low/undetectable (CTs > 35) across all of the samples on this panel (data not shown). 

General_screening_panel_v1.4 Summary: Ag3979 Expression of the COR87940554 
gene is highest in prostate cancer cell line PC-3 and a melanoma cell line (CT = 28). Thus, the 
expression of this gene could be used to distinguish these cells from the other samples in the 
panel. In addition, there is substantial expression of this gene associated with kidney cancer cell 
lines and colon cancer cell lines. Therefore, therapeutic modulation of the activity of this gene or 
its protein product, through the use of small molecule drugs, protein therapeutics or antibodies, 
might be of benefit in the treatment of kidney cancer, prostate cancer, colon cancer or melanoma. 
Finally, expression of this gene is much higher in fetal liver (CT = 29) than adult liver (CT = 40), 
as well as in fetal kidney (CT = 30) than adult kidney (CT = 40). This observation suggests that 
expression of this gene may be used to distinguish fetal from adult liver or kidney. 
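
The relationship between the CT values quoted above and a "Rel. Exp.(%)" scale can be sketched by normalizing each sample to the lowest CT in the set. This is a hedged illustration, not the normalization procedure of the panels: it assumes perfect doubling per cycle, uses only the rounded CTs quoted in the summary, and the function name `percent_relative_expression` is hypothetical.

```python
def percent_relative_expression(cts):
    """Scale samples so the lowest CT (highest expression) is 100%.

    Assumes ideal amplification: each additional cycle halves the
    implied starting template.
    """
    best_ct = min(cts.values())
    return {name: 100.0 * 2.0 ** (best_ct - ct) for name, ct in cts.items()}


# Rounded CT values quoted in the General_screening_panel_v1.4 summary.
REPORTED_CTS = {
    "prostate cancer PC-3": 28,
    "fetal liver": 29,
    "fetal kidney": 30,
    "adult liver": 40,
    "adult kidney": 40,
}

rel = percent_relative_expression(REPORTED_CTS)
for name, value in rel.items():
    print(f"{name:22s} {value:8.3f} %")
```

On this scale a CT of 40 falls far below 0.1%, which is why the adult liver and kidney samples read as effectively unexpressed next to their fetal counterparts.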

This gene encodes a protein with homology to kinases and is expressed at very low levels 
in the fetal brain, hippocampus, and cerebellum. This gene is predominantly expressed in fetal 
tissues and in cancer cell lines, suggesting that it plays a role in cell division or differentiation. 
This gene may therefore be of use in regulation of the cell cycle in stem cell research or 
therapy. 

Panel 2.1 Summary: Ag3979 Expression of the COR87940554 gene is highest in a 
sample derived from a metastatic breast cancer (CT = 30.9). Thus, the expression of this gene 
could be used to distinguish this metastatic breast cancer specimen from other samples in the 
panel. In addition, there appears to be substantial expression of this gene associated with a number 
of normal kidney tissue samples adjacent to malignant kidney. Therefore, therapeutic modulation 
of the activity of this gene or its protein product, through the use of small molecule drugs, protein 
therapeutics or antibodies, might be of benefit in the treatment of breast and kidney cancer. 

Panel 4.1D Summary: Ag3979 Expression of this gene is highest in lung microvascular 
endothelial cells (CT = 29.7). The COR87940554 gene is also expressed in fibroblasts, 
endothelium, and smooth muscle cells. This gene encodes a putative protein kinase that localizes 
to the nucleus based on PSORT analysis. The protein encoded by this transcript may be 
important in the normal function of the fibroblasts, endothelial cells and smooth muscle cells. 
Therefore, therapies designed with the protein encoded by this transcript could be used to 
regulate fibroblast, endothelium and smooth muscle cell function and may be important in the 
treatment of asthma, emphysema, arthritis, and inflammatory bowel disease. 

C. NOV8a and NOV8b: GPCR 



Expression of the NOV8a gene (CG56663-01) and its variant NOV8b (CG56663-02) was 
assessed using the primer-probe set Ag2971, described in Table C1. Results of the RTQ-PCR runs 
are shown in Tables C2, C3 and C4. NOV8b represents a full-length physical clone of the 
NOV8a gene, validating the prediction of the gene sequence. 

Table C1. Probe Name Ag2971 

Primers | Sequences | Length | Start Position | Seq ID No. 
Forward | 5'-gtaaaggcatctccacctgact-3' | 22 | 947 | 92 
Probe | TET-5'-tcacttccatccagggccactgg-3'-TAMRA | 23 | 969 | 93 
Reverse | 5'-gggctaatatcagctggaattc-3' | 22 | 1009 | 94 



Table C2. CNS_neurodegeneration_v1.0 

Tissue Name | Rel. Exp.(%) Ag2971, Run 209778983 | Tissue Name | Rel. Exp.(%) Ag2971, Run 209778983 


AD 1 Hippo 


6.5 


Control (Path) 3 
Temporal Ctx ! 


AD 2 Hippo 


31.6 


Control (Path) 4 
Temporal Ctx 


51.4 

T4.9"' 


AD 3 Hippo 


1.8 


AD 1 Occipital Ctx 


AD 4 Hippo 


15.1 


AD 2 Occinital Ctx 
(Missing) 


0.0 


AD 5 hippo 


52.9 


AD 3 Occipital Ctx 


__9- 8 _ 


AD 6 Hippo 


19.6 


AD 4 Occipital Ctx 


20.9 


V^VJiili \J \ jL, JL l liJljKj 


18 9 


AD 5 Occipital Ctx 


11.9 


Control 4 Hippo 


3.9 


AD 6 Occipital Ctx 


26.1 


Control (Path) 3 
Hippo 


3.8 


Control 1 Occipital 
Ctx 


a a . ; jWt J ..j.. J ..u.L..!n....*.mmni,.j(ji. * „.,.,„nu„„„„„> „,.„„„ ™. 

2.8 


AD 1 Temporal Ctx 


9.5 


Control 2 Occipital 
Ctx 


46.3 


AD 2 Temporal Ctx 


22.1 


Control 3 Occipital 
Ctx 

Control 4 Occipital 
Ctx 


13.2 


AD 3 Temporal Ctx 


3.1 


4.4 


AD 4 Temporal Ctx 


28.5 


Control (Path) 1 
Occipital Ctx 


100.0 


AD 5 Inf Temporal 
Ctx 


82.4 


Control (Path) 2 
Occipital Ctx 


7.3 


AD 5 SupTemporal 
Ctx 


46.0 


Control (Path) 3 
Occipital Ctx 


5.5 



23.3 


AD 6 Inf Temporal 
Ctx 


30.4 


Control (Tath) 4~ 
Occipital Ctx 
AD 6 Sup Temporal • 2g 3 jControl 1 Parietal : g 4 j 

Ctx j ' jCtx j ' j 


Control 1 Temporal j ^ 5 
Ctx " | 
Control 2 Temporal j ^3 7 
Ctx ' j 


Control 2 Panetal ~ c _ 

ni i 25.3 « 

Ctx 

Control 3 Parietal ; ~~ « 
Ctx 


Control 3 Temporal 
Ctx 


32.3 


Control (Path) 1 * gJ g 
Parietal Ctx 

1 - . ~ ~~ . Ji -.^-.w^,^. ., V ,.-,«« vKK ,^.,- , 


Control 4 Temporal 

Ctx _ _____ 


jControl (Path) 2 { „ „ 
(Parietal Ctx j 


Control (Path) 1 
Temporal Ctx 

Control (Path) 2 
Temporal Ctx 


g07 [Control (Path) 3 \ 32 
{Parietal Ctx j 

" 7n c IControl (Path) 4 i 
1L ° jParietalCtx J 



Table C3. Panel 1.3D 



Tissue Name | Rel. Exp.(%) Ag2971, Run 166219829 | Tissue Name | Rel. Exp.(%) Ag2971, Run 166219829 


Liver adenocarcinoma j 


4.2 


Kidney (fetal) 


1.0 


Pancreas 


0.0 


Renal ca. 786-0 


0.0 


Pancreatic ca. CAP AN 2 


0.0 


Renal ca. A498 


0.0 


Adrenal gland 


0.0


Renal ca. RXF 393


0.0


Thyroid


3.4


Renal ca. ACHN


0.0


Salivary gland 


0.0


Renal ca. UO-31




Pituitary gland


4.3


Renal ca. TK-10


0.0


Brain (fetal)


33.4


Liver 




Brain (whole) 


45.7 


Liver (fetal) 


21.0 


Brain (amygdala) 


18.2 


Liver ca. 

(hepatoblast) HepG2 


1.9 


Brain (cerebellum) 


1.5 


Lung 


0.0 


Brain (hippocampus) 


7.5 


Lung (fetal) 


1.1 


Brain (substantia nigra) 


17.8 


Lung ca. (small cell) 
LX-1 


0.0 


Brain (thalamus) 


14.9 


Lung ca. (small cell) 
NCI-H69 


0.0 


Cerebral Cortex


22.1 


Lung ca. (s.cell var.) 
SHP-77 


0.0 


Spinal cord


4.9 


Lung ca. (large 
cell)NCI-H460 


0.0 


glio/astro U87-MG 


0.0 


Lung ca. (non-sm. 
cell) A549 


0.0 
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j 

glio/astro U-118-MG


Lung ca. (non-s.cell)
NCI-H23


0.0 


astrocytoma SW1783


0.0


Lung ca. (non-s.cell)
HOP-62


3.9 


neuro*; met SK-N-AS


Lung ca. (non-s.cl)
NCI-H522


0.0 




astrocytoma SF-539 


0.0


Lung ca. (squam.) 
SW 900 


0.0 


astrocytoma SNB-75 


0.0 


Lung ca. (squam.) 
NCI-H596 


0.0 

1 


glioma SNB-19 


0.0 


Mammary gland 


5.9 




0.0 


Breast ca.* (pl.ef) 
MCF-7 


0.0 




glioma SF-295 


0.0 



Breast ca.* (pl.ef) 
MDA-MB-231 


0.0 




Heart (fetal) 


15.1 


Breast ca.* (pl.ef) 
T47D 


0.0 


Heart 


0.0 


Breast ca. BT-549 


0.0 


Skeletal muscle (fetal) 


3.3 


Breast ca. MDA-N 


0.0 


Skeletal muscle 


0.8 


Ovary 


0.5


Bone marrow


100.0


Ovarian ca. OVCAR-3


0.0


Thymus


0.0 


Ovarian ca. OVCAR- 
4 


0.0 




Spleen 


2.0 


Ovarian ca. OVCAR- 
5 


0.0 




Lymph node 


0.0 


Ovarian ca. OVCAR- 
8 


7.3 


Colorectal 


4.2 


Ovarian ca. IGROV-1 


0.0 


Stomach 


0.0 


Ovarian ca.* (ascites) 
SK-OV-3


57.0 


Small intestine 


1.1 


Uterus 


0.9 




Colon ca. SW480 


0.9 


Placenta 


27.9 


Colon ca.* 
SW620(SW480 met) 


0.0 


Prostate 


0.0 


Colon ca. HT29 


0.0 


Prostate ca.* (bone 
met)PC-3 


4.5 


Colon ca. HCT-116 


0.0 


Testis 


12.8 


Colon ca. CaCo-2 


0.0 


Melanoma 
Hs688(A).T 


0.0 


Colon ca. 


0.0 


Melanoma* (met) 


0.0 
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tissue(OD03866) 


Hs688(B).T


Colon ca. HCC-2998 


0.0


Melanoma UACC-62


9.6


Gastric ca.* (liver met) 
NCI-N87 


0.0 


Melanoma M14


1.1


Bladder 
Trachea 


0.0 
0.0 


Melanoma LOX 
IMVI \ 

Melanoma* (met) 
SK-MEL-5 


0.0 
0.0 


Kidney 


0.0


Adipose


2.5



Table C4. Panel 4D 







Rel. Exp.(%)




Rel. Exp.(%) 




Tissue Name 


Ag2971, Run 
164403109 


Tissue Name 


Ag2971, Run 
164403109 




Secondary Th1 act


0.6


HUVEC IL-1beta


0.0


Secondary Th2 act


0.4


HUVEC IFN gamma


0.7


Secondary Tr1 act


0.0


HUVEC TNF alpha + IFN
gamma


0.0 




Secondary Th1 rest


0.0


HUVEC TNF alpha + IL4


0.0


Secondary Th2 rest


0.0


HUVEC IL-11


0.0


Secondary Tr1 rest


0.0


Lung Microvascular EC
none


0.0


Primary Th1 act


0.3


Lung Microvascular EC
TNFalpha + IL-1beta


0.0


Primary Th2 act


0.0


Microvascular Dermal EC
none


0.0


Primary Tr1 act


0.0


Microvascular Dermal EC
TNFalpha + IL-1beta


0.0


Primary Th1 rest


1.5


Bronchial epithelium
TNFalpha + IL-1beta


0.0 




Primary Th2 rest 


2.2 


Small airway epithelium 
none 


0.0 




Primary Tr1 rest


0.4


Small airway epithelium
TNFalpha + IL-1beta


0.0 




CD45RA CD4 
lymphocyte act 


0.0 


Coronary artery SMC rest


3.1 




CD45RO CD4 


0.8 


Coronary artery SMC


0.5 




lymphocyte act 


TNFalpha + IL-1beta




CD8 lymphocyte act 


0.0 


Astrocytes rest 


0.0 




Secondary CD8 


0.4 


Astrocytes TNFalpha + IL- 


0.0 




lymphocyte rest 


1beta




Secondary CD8 




KU-812 (Basophil) rest


47.0




lymphocyte act 

CD4 lymphocyte none 


Q1 ;KU-8 12 (Basophil) 
PMA/ionomycin


100.0


2ry Th1/Th2/Tr1_anti-
CD95 CH11

LAK cells rest 


0.1 
0.2 


CCD1106 (Keratinocytes)
none

CCD1106 (Keratinocytes)
TNFalpha + IL-1beta


LAK cells IL-2 


1.1


Liver cirrhosis


2.3


LAK cells IL-2+IL-12 


1.4


Lupus kidney


0.0


LAK cells IL-2+IFN 
gamma 


0.5 


NCI-H292 none 


0.0 


LAK cells IL-2+ IL-18 


0.6 


NCI-H292 IL-4 


0.0 


LAK cells 
PMA/ionomycin 


0.0 


NCI-H292 IL-9 


0.0 


NK Cells IL-2 rest 


0.5 


NCI-H292 IL-13 


0.0 


Two Way MLR 3 day 


0.2 


NCI-H292 IFN gamma 


0.0 


Two Way MLR 5 day 


0.0 


HPAEC none 


0.0 


Two Way MLR 7 day 


0.0 


HPAEC TNF alpha + IL-1
beta 

Lung fibroblast none 


0.0 


PBMC rest 
PBMC PWM 


~0.6 




1.4 


Lung fibroblast TNF alpha 
+ IL-1 beta 


0.2 


PBMC PHA-L 


1.1 


Lung fibroblast IL-4 


0.2 


Ramos (B cell) none 


0.0 


Lung fibroblast IL-9 


0.8 


Ramos (B cell) 
ionomycin 


0.0 



Lung fibroblast IL-13 



0.0 


B lymphocytes PWM 


0.8 


Lung fibroblast IFN gamma 


0.2 


B lymphocytes CD40L 
and IL-4 


1.1 


Dermal fibroblast 
CCD1070 rest


0.0 




1.1 


Dermal fibroblast 
CCD1070 TNF alpha


4.5 


EOL-1 dbcAMP 
PMA/ionomycin 


0.0 


Dermal fibroblast 
CCD1070 IL-1 beta


0.0 


Dendritic cells none 


0.0 


Dermal fibroblast IFN 
gamma 


0.0 


Dendritic cells LPS


0.0 


Dermal fibroblast IL-4 


0.0 


Dendritic cells anti- 
CD40 


0.3 



IBD Colitis 2 


0.2 


Monocytes rest 


2.5 


IBD Crohn's 


0.0 


Monocytes LPS 


0.2


Colon


0.9 


Macrophages rest


0.2


Lung


0.3 
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Macrophages LPS j 0.2^ jThymus 

H^ECnon e""^^ TZ \° jKidney" 
HUVEC stored ~ "J™ 0 : (T J 

CNS_neurodegeneration_v1.0 Summary: Ag2971 This panel confirms the expression
of the CG56663-01 gene at low levels in the brain in an independent group of individuals.
However, no differential expression of this gene was detected between Alzheimer's diseased
postmortem brains and those of non-demented controls in this experiment. Please see Panel 1.3D
for a discussion of the potential utility of this gene in treatment of central nervous system
disorders.

Panel 1.3D Summary: Ag2971 Expression of the CG56663-01 gene is highest in bone
marrow (CT = 31.6). Interestingly, expression of this gene is significantly higher in fetal heart
(CT = 34.3) than in adult heart (CT = 40), and higher in fetal liver (CT = 33.8) than in adult
liver (CT = 40). This observation suggests that expression of this gene may be used to distinguish
fetal from adult heart and liver.
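The relationship between the CT values quoted in this summary and the relative expression percentages tabulated for Panel 1.3D can be illustrated with a short calculation. The following is a minimal sketch, not the applicants' stated procedure: it assumes ideal amplification (template doubling at each cycle) and normalization to the sample with the lowest CT, and the function name `relative_expression` is hypothetical.

```python
def relative_expression(ct_by_sample, efficiency=2.0):
    """Return percent expression relative to the sample with the lowest CT.

    Assumption: each PCR cycle multiplies the template by `efficiency`
    (2.0 means perfect doubling), so a sample that crosses threshold
    `d` cycles later holds efficiency**(-d) times as much template.
    """
    ct_min = min(ct_by_sample.values())
    return {
        sample: 100.0 * efficiency ** (ct_min - ct)
        for sample, ct in ct_by_sample.items()
    }


# CT values quoted in the Panel 1.3D summary for Ag2971.
cts = {
    "Bone marrow": 31.6,
    "Heart (fetal)": 34.3,
    "Heart": 40.0,
    "Liver (fetal)": 33.8,
    "Liver": 40.0,
}

rel = relative_expression(cts)
for sample, pct in rel.items():
    print(f"{sample:15s} {pct:6.1f}")
```

Under these assumptions fetal heart and fetal liver come out at roughly 15% and 22% of the bone marrow signal, close to the 15.1 and 21.0 listed in Table C3, while the adult tissues at CT = 40 fall below 1%.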

This gene is also expressed at low levels in several regions of the CNS examined,
including amygdala, substantia nigra, thalamus and cerebral cortex. This gene encodes a novel
G-protein coupled receptor (GPCR). The GPCR family of receptors contains a large number of
neurotransmitter receptors, including the dopamine, serotonin, α- and β-adrenergic, acetylcholine
muscarinic, histamine, peptide, and metabotropic glutamate receptors. GPCRs are excellent drug
targets in various neurologic and psychiatric diseases. All antipsychotics have been shown to act
at the dopamine D2 receptor; similarly, novel antipsychotics also act at the serotonergic receptor,
and often the muscarinic and adrenergic receptors as well. While the majority of antidepressants
can be classified as selective serotonin reuptake inhibitors, blockade of the 5-HT1A and α2-
adrenergic receptors increases the effects of these drugs. The GPCRs are also of use as drug
targets in the treatment of stroke. Blockade of the glutamate receptors may decrease the neuronal
death resulting from excitotoxicity; furthermore, the purinergic receptors have also been
implicated as drug targets in the treatment of cerebral ischemia. The β-adrenergic receptors have
been implicated in the treatment of ADHD with Ritalin, while the α-adrenergic receptors have
been implicated in memory. Therefore, this gene may be of use as a small molecule target for the
treatment of any of the described diseases.
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References: 



1. El Yacoubi M, Ledent C, Parmentier M, Bertorelli R, Ongini E, Costentin J, Vaugeois
JM. Adenosine A2A receptor antagonists are potential antidepressants: evidence based on
pharmacology and A2A receptor knockout mice. Br J Pharmacol 2001 Sep;134(1):68-77

1. Adenosine, an ubiquitous neuromodulator, and its analogues have been shown to
produce 'depressant' effects in animal models believed to be relevant to depressive disorders,
while adenosine receptor antagonists have been found to reverse adenosine-mediated 'depressant'
effect. 2. We have designed studies to assess whether adenosine A2A receptor antagonists, or
genetic inactivation of the receptor would be effective in established screening procedures, such
as tail suspension and forced swim tests, which are predictive of clinical antidepressant activity. 3.
Adenosine A2A receptor knockout mice were found to be less sensitive to 'depressant' challenges
than their wildtype littermates. Consistently, the adenosine A2A receptor blockers SCH 58261 (1
- 10 mg kg(-1), i.p.) and KW 6002 (0.1 - 10 mg kg(-1), p.o.) reduced the total immobility time in
the tail suspension test. 4. The efficacy of adenosine A2A receptor antagonists in reducing
immobility time in the tail suspension test was confirmed and extended in two groups of mice.
Specifically, SCH 58261 (1 - 10 mg kg(-1)) and ZM 241385 (15 - 60 mg kg(-1)) were effective in
mice previously screened for having high immobility time, while SCH 58261 at 10 mg kg(-1)
reduced immobility of mice that were selectively bred for their spontaneous 'helplessness' in this
assay. 5. Additional experiments were carried out using the forced swim test. SCH 58261 at 10
mg kg(-1) reduced the immobility time by 61%, while KW 6002 decreased the total immobility
time at the doses of 1 and 10 mg kg(-1) by 75 and 79%, respectively. 6. Administration of the
dopamine D2 receptor antagonist haloperidol (50 - 200 microg kg(-1) i.p.) prevented the
antidepressant-like effects elicited by SCH 58261 (10 mg kg(-1) i.p.) in forced swim test whereas
it left unaltered its stimulant motor effects. 7. In conclusion, these data support the hypothesis that
A2A receptor antagonists prolong escape-directed behaviour in two screening tests for
antidepressants. Altogether the results support the hypothesis that blockade of the adenosine A2A
receptor might be an interesting target for the development of effective antidepressant agents.

2. Blier P. Pharmacology of rapid-onset antidepressant treatment strategies. J Clin
Psychiatry 2001;62 Suppl 15:12-7
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Although selective serotonin reuptake inhibitors (SSRIs) block serotonin (5-HT) reuptake 
rapidly, their therapeutic action is delayed. The increase in synaptic 5-HT activates feedback 
mechanisms mediated by 5-HT1A (cell body) and 5-HT1B (terminal) autoreceptors, which, 
respectively, reduce the firing in 5-HT neurons and decrease the amount of 5-HT released per 
action potential resulting in attenuated 5-HT neurotransmission. Long-term treatment desensitizes
the inhibitory 5-HT1 autoreceptors, and 5-HT neurotransmission is enhanced. The time course of
these events is similar to the delay of clinical action. The addition of pindolol, which blocks
5-HT1A receptors, to SSRI treatment decouples the feedback inhibition of 5-HT neuron firing and
accelerates and enhances the antidepressant response. The neuronal circuitry of the 5-HT and
norepinephrine (NE) systems and their connections to forebrain areas believed to be involved in
depression has been dissected. The firing of 5-HT neurons in the raphe nuclei is driven, at least
partly, by alpha1-adrenoceptor-mediated excitatory inputs from NE neurons. Inhibitory alpha2-
adrenoceptors on the NE neuroterminals form part of a feedback control mechanism. Mirtazapine,
an antagonist at alpha2-adrenoceptors, does not enhance 5-HT neurotransmission directly but
disinhibits the NE activation of 5-HT neurons and thereby increases 5-HT neurotransmission by a
mechanism that does not require a time-dependent desensitization of receptors. These
neurobiological phenomena may underlie the apparently faster onset of action of mirtazapine
compared with the SSRIs.

3. Tranquillini ME, Reggiani A. Glycine-site antagonists and stroke. Expert Opin Investig
Drugs 1999 Nov;8(11):1837-1848

The excitatory amino acid, (S)-glutamic acid, plays an important role in controlling many 
neuronal processes. Its action is mediated by two main groups of receptors: the ionotropic 
receptors (which include NMDA, AMPA and kainic acid subtypes) and the metabotropic 
receptors (mGluR(1-8)) mediating G-protein coupled responses. This review focuses on the
strychnine insensitive glycine binding site located on the NMDA receptor channel, and on the
possible use of selective antagonists for the treatment of stroke. Stroke is a devastating disease
caused by a sudden vascular accident. Neurochemically, a massive release of glutamate occurs in
neuronal tissue; this overactivates the NMDA receptor, leading to increased intracellular calcium
influx, which causes neuronal cell death through necrosis. NMDA receptor activation strongly
depends upon the presence of glycine as a co-agonist. Therefore, the administration of a glycine
antagonist can block overactivation of NMDA receptors, thus preserving neurones from damage.
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The glycine antagonists currently identified can be divided into five main categories depending on 
their chemical structure: indoles, tetrahydroquinolines, benzoazepines, quinoxalinediones and 
pyridazinoquinolines.

4. Monopoli A, Lozza G, Forlani A, Mattavelli A, Ongini E. Blockade of adenosine A2A 
receptors by SCH 58261 results in neuroprotective effects in cerebral ischaemia in rats. 
Neuroreport 1998 Dec 1;9(17):3955-9

Blockade of adenosine receptors can reduce cerebral infarct size in the model of global 
ischaemia. Using the potent and selective A2A adenosine receptor antagonist, SCH 58261, we 
assessed whether A2A receptors are involved in the neuronal damage following focal cerebral 
ischaemia as induced by occluding the left middle cerebral artery. SCH 58261 (0.01 mg/kg either 
i.p. or i.v.) administered to normotensive rats 10 min after ischaemia markedly reduced cortical 
infarct volume as measured 24 h later (30% vs controls, p < 0.05). Similar effects were observed 
when SCH 58261 (0.01 mg/kg, i.p.) was administered to hypertensive rats (28% infarct volume 
reduction vs controls, p < 0.05). Neuroprotective properties of SCH 58261 administered after 
ischaemia indicate that blockade of A2A adenosine receptors is a potentially useful biological 
target for the reduction of brain injury. 

Panel 4D Summary: Ag2971 The CG56663-01 gene is expressed exclusively in the 
basophil cell line KU-812, irrespective of treatment with PMA and ionomycin. Thus, expression 
of this gene may be used to distinguish basophils from the other samples on this panel. This gene 
encodes a putative GPCR and it is known that GPCR-type receptors are important in multiple 
physiological responses mediated by basophils (ref. 1). Therefore, antibody or small molecule 
therapies designed with the protein encoded for by this gene could block or inhibit inflammation 
or tissue damage due to basophil activation in response to asthma, allergies, hypersensitivity 
reactions, psoriasis, and viral infections. 

Reference: 

1. Heinemann A., Hartnell A., Stubbs V.E., Murakami K., Soler D., LaRosa G., Askenase 
P.W., Williams T.J., Sabroe I. (2000) Basophil responses to chemokines are regulated by both 
sequential and cooperative receptor signaling. J. Immunol. 165: 7224-7233. 
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To investigate human basophil responses to chemokines, we have developed a sensitive 
assay that uses flow cytometry to measure leukocyte shape change as a marker of cell 
responsiveness. PBMC were isolated from the blood of volunteers. Basophils were identified as a 
single population of cells that stained positive for IL-3Ralpha (CDw123) and negative for HLA-
DR, and their increase in forward scatter (as a result of cell shape change) in response to
chemokines was measured. Shape change responses of basophils to chemokines were highly 
reproducible, with a rank order of potency: monocyte chemoattractant protein (MCP) 4 (peak at /= 
eotaxin-2 = eotaxin-3 >/= eotaxin > MCP-1 = MCP-3 > macrophage-inflammatory protein-1alpha
> RANTES = MCP-2 = IL-8. The CCR4-selective ligand macrophage-derived chemokine did not 
elicit a response at concentrations up to 10 nM. Blocking mAbs to CCR2 and CCR3 demonstrated 
that responses to higher concentrations (>10 nM) of MCP-1 were mediated by CCR3 rather than 
CCR2, whereas MCP-4 exhibited a biphasic response consistent with sequential activation of 
CCR3 at lower concentrations and CCR2 at 10 nM MCP-4 and above. In contrast, responses to 
MCP-3 were blocked only in the presence of both mAbs, but not after pretreatment with either 
anti-CCR2 or anti-CCR3 mAb alone. These patterns of receptor usage were different from those 
seen for eosinophils and monocytes. We suggest that cooperation between CCRs might be a 
mechanism for preferential recruitment of basophils, as occurs in tissue hypersensitivity responses 
in vivo. 

PMID: 11120855 

D. NOV9: Dual Specificity Phosphatase

Expression of the NOV9 gene (CG56787-01) was assessed using the primer-probe set
Ag3021, described in Table D1. Results of the RTQ-PCR runs are shown in Tables D2, D3 and
D4.

Table D1. Probe Name Ag3021

Primers  Sequences                                   Length  Start Position  Seq ID No.
Forward  5'-aattgtttggcaagaacactgt-3'                22      512             95
Probe    TET-5'-ccagtgggaatgatccctgacatcta-3'-TAMRA  26      550             96
Reverse  5'-atcatcaaacggacttccttct-3'                22      578             97
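The primer-probe geometry in Table D1 can be sanity-checked with a few lines of code. This is a hedged sketch only: it assumes that each start position is the 5'-most coordinate of the oligonucleotide footprint on the sense strand of the target sequence, which the table does not state, and the helper names (`gc_fraction`, `reverse_complement`, `amplicon_span`) are hypothetical.

```python
# Sequences and start positions as listed in Table D1 (Ag3021).
PRIMERS = {
    "forward": ("aattgtttggcaagaacactgt", 512),
    "probe": ("ccagtgggaatgatccctgacatcta", 550),
    "reverse": ("atcatcaaacggacttccttct", 578),
}


def gc_fraction(seq):
    """Fraction of G and C bases in a lowercase DNA string."""
    return sum(base in "gc" for base in seq) / len(seq)


def reverse_complement(seq):
    """Reverse complement of a lowercase DNA string."""
    return seq.translate(str.maketrans("acgt", "tgca"))[::-1]


def amplicon_span(primers):
    """Return (first, last, length) of the amplified region.

    Assumption: the reverse primer's start position marks the left edge
    of its footprint on the sense strand, so the amplicon ends at
    start + length - 1.
    """
    fwd_start = primers["forward"][1]
    rev_seq, rev_start = primers["reverse"]
    last = rev_start + len(rev_seq) - 1
    return fwd_start, last, last - fwd_start + 1


first, last, length = amplicon_span(PRIMERS)
print(f"amplicon: {first}-{last} ({length} bp)")
for name, (seq, start) in PRIMERS.items():
    print(f"{name:8s} len={len(seq):2d} GC={gc_fraction(seq):.2f}")
```

If the coordinate assumption holds, the probe footprint lies entirely between the two primers, which is the arrangement a hydrolysis-probe assay requires.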



Table D2. CNS_neurodegeneration_v1.0

Tissue Name


Rel. Exp.(%) Ag3021,
Run 209821073


Tissue Name


Rel. Exp.(%) Ag3021,
Run 209821073

AD 1 Hippo


3.3


Control (Path) 3


1.0 
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AD 2 Hippo 



AD 3 Hippo 



AD 4 Hippo 
AD 5 Hippo 



i 



AD 6 Hippo 
Control 2 Hippo 



Control 4 Hippo 



Control (Path) 3 
Hippo 



AD 1 Temporal Ctx 



AD 2 Temporal Ctx 



AD 3 Temporal Ctx 



AD 4 Temporal Ctx 



AD 5 Inf Temporal 

Ctx 

AD 5 Sup Temporal 
Ctx 



AD 6 Inf Temporal 
Ctx 



AD 6 Sup Temporal 
Ctx 



Control 1 Temporal 
Ctx 



Control 2 Temporal 
Ctx 



Control 3 Temporal 
Ctx 



4- 



Control 3 Temporal 
Ctx 



Control (Path) 1 
Temporal Ctx 



Control (Path) 2 
Temporal Ctx 



5.8 



VL 

1.2 

12.6 
K)3~ 

3.9
2.7 



4.3 



5.1 



1.8 



4.2 



18.8 



13.3 



13.0 
12.2 



Control (Path) 4 
Temporal Ctx 



AD 1 Occipital Ctx

AD 2 Occipital Ctx

(Missing)

AD 3 Occipital Ctx

AD 4 Occipital Ctx

AD 5 Occipital Ctx
AD 6 Occipital Ctx
Control 1 Occipital
Ctx




2.2 



0.2 



Control 2 Occipital 
Ctx 



Control 3 Occipital 

Ctx



Control 4 Occipital 
Ctx 

Control (Path) 1 
Occipital Ctx 



Control (Path) 2 
Occipital Ctx 



Control (Path) 3 
Occipital Ctx 



Control (Path) 4 
Occipital Ctx 
Control 1 Parietal 
Ctx 



Control 2 Parietal 



Control 3 Parietal
Ctx 



10.7 



4.5 



Control (Path) 1 
Parietal Ctx 

Control (Path) 2 
Parietal Ctx 



Control (Path) 3 
Parietal Ctx 



Control (Path) 4 
Parietal Ctx 



5.5 

0.0

2.5
100.0

4.2
5.3

1.4 



5.0 



2.7 



3.0 
13.4 



1.6 



1.1 



3.0 
1.5 
8.2 



2.9 



9.1 



5.9 



0.6 



6.8 
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TahleD3^anelL3P 



Tissue Name 

Liver adenocarcinoma 
Pancreas 


Rel. A^021, | " " ^ Name j 
Run 167966916 ! J 

14.2 jKidney (fetal) 
4.4 jRenal ca. 786-0 


Rel. Exp.(%) Ag3021, 
Run 167966916 

"24.3 

15.0 


Pancreatic ca. CAPAN 2
Adrenal gland
Thyroid

Salivary gland

Pituitary gland
Brain (fetal)
Brain (whole)

Brain (amygdala)


6.0 11 
2.6 j] 

^4 I] 

15.7 t; 

i 

60.7 J 
11.5 

11.9 


Renal ca. A498
Renal ca. RXF 393
Renal ca. ACHN
Renal ca. UO-31
Renal ca. TK-10
Liver 

Liver (fetal) 
Liver ca. 

(hepatoblast) HepG2 


10.0

15.2 
2.8 " 
10.2 
13.5 

' 2.1 " 

~oj] " 

2.5 


Brain (cerebellum) 


5.8 


Lung 


10.7 


Brain (hippocampus) 


10.0 


Lung (fetal) 


19.1 


| Brain (substantia nigra) 


22.1 


Lung ca. (small cell) j 
LX-1 


21.8 


| Brain (thalamus) 


10.9 


Lung ca. (small cell) 
NCI-H69 


20.2 


j Cerebral Cortex 


10.7 i 


Lung ca. (s.cell var.) 
SHP-77 


31.6 


| Spinal cord 


11.5 


Lung ca. (large 
cell)NCI-H460 


1.8 


glio/astro U87-MG 


26.8 


Lung ca. (non-sm. 
cell) A549 


20.0 


glio/astro U-118-MG 


23.5 


Lung ca. (non-s.cell) 
NCI-H23 


16.0 


astrocytoma SW1783


24.0 


Lung ca. (non-s.cell) 
HOP-62 


9.3 


neuro*; met SK-N-AS


9.7 


Lung ca. (non-s.cl) 
NCI-H522 


12.0 


astrocytoma SF-539


9.5 


Lung ca. (squam.) 
SW 900 


17.6 


| astrocytoma SNB-75 


20.4 


i Lung ca. (squam.) 
NCI-H596 


24.3 


glioma SNB- 19 


6.5


Mammary gland


11.0 


glioma U251 


16.2 
29.9 


Breast ca.* (pl.ef)
MCF-7


16.6 


glioma SF-295 


Breast ca.* (pl.ef)
MDA-MB-231


8.7 
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Heart (fetal) j 6.6 


Breast ca.* (pl.ef)
T47D




40.1 


Heart j 6.1 


Breast ca. BT-549


7.0 


Skeletal muscle (fetal) j 4.6 j 


Breast ca. MDA-N 




0.9 i 


Skeletal muscle j 


4.0 


- ' --- - — i 

Ovary 


9.5 


Bone marrow j 6.4 


Ovarian ca. OVCAR- 1 

3 i 


! 

7.3 | 


Thymus 


17.2 


Ovarian ca. OVCAR- ; 
4 


Spleen 


6.6 


Ovarian ca. OVCAR- : 
5 


100.0 


Lymph node 


17.1 ; 


Ovarian ca. OVCAR- 
8 


5.9 


Colorectal 


20.7 


Ovarian ca. IGROV-1 


4.3 


Stomach 


6.7 


Ovarian ca.* (ascites) 
SK-OV-3 


47.3 


Small intestine 


5.5 




Uterus 




8.1 


Colon ca. SW480


5.3 


Placenta 


" b.o 


Colon ca.* 
SW620(SW480 met) 


27.7 


Prostate


1.6 


Colon ca. HT29 


12.5 


Prostate ca.* (bone 
met)PC-3 


9.8 


Colon ca. HCT-116


12.1 


Testis 


14.2 


Colon ca. CaCo-2 


20.4 


Melanoma 
Hs688(A).T 


3.8 


Colon ca. 
tissue(OD03866) 


18.3 


Melanoma* (met) 
Hs688(B).T 


2.7 


Colon ca. HCC-2998 


19.8 


Melanoma UACC-62 


11.5 
3.5 


Gastric ca.* (liver met) 
NCI-N87 


44.4 


Melanoma M14 


Bladder 


10.9 


Melanoma LOX 
IMVI 


9.7 


Trachea 


12.4 


Melanoma* (met) 
SK-MEL-5 


15.1 


Kidney 


5.8 


Adipose 


11.0 


Table D4. Panel 4D 


Tissue Name 


Rel. Exp.(%) 
Ag3021, Run 
164528127 


Tissue Name 


Rel. Exp.(%) 
Ag3021, Run 
164528127 


Secondary Thl act 


15.1 


HUVEC IL-lbeta 


11.1 
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^prnriHarv Th2 act ! 


16.5 


HUVEC IFN gamma


20.6


Secondary Trl act 


15.5 


HUVEC TNF alpha + IFN
gamma


25.9


Secondary Th1 rest


4.5 


HUVEC TNF alpha + IL4


22.2


Secondary Th2 rest


O.Z 


HUVEC IL-11


18.3


Secondary Tr1 rest


7.6


Lung Microvascular EC
none


24.3


Primary Th1 act


6.6


Lung Microvascular EC
TNFalpha + IL-1beta


27.7 


Primary Th2 act 


10.1 


Microvascular Dermal EC
none


33.2 


Primary Tr1 act


15.6


Microvascular Dermal EC
TNFalpha + IL-1beta


27.7


Primary Th1 rest


34.4


Bronchial epithelium
TNFalpha + IL-1beta


16.4 


Primary Th2 rest 


14.3 




Small airway epithelium 
none 


8.4 


Primary Trl rest 


7.1 


Small airway epithelium 
TNFalpha + IL-lbeta 


100.0 


CD45RA CD4 
lymphocyte act 


4.4 


Coronery artery SMC rest 


10.4 


|CD45RO CD4 
lymphocyte act 


16.3 


Coronery artery SMC 
TNFalpha + IL-lbeta 


4.9 


|CD8 lymphocyte act 


9.0 




Astrocytes rest 




(Secondary CD8 
lymphocyte rest 


15.4 




Astrocytes TNFalpha + IL- 
lbeta 


13.3 


Secondary CD8 
lymphocyte act 


6.7 


KU-812 (Basophil) rest 


3.9 


CD4 lymphocyte none 


6.4 


KU-8 12 (Basophil) 
PMA/ionomycin 


19.8 


|2ry Thl/Th2/Trl_anti- 


11.6 




CCD1 106 (Keratinocytes) 


6.8 






none 


LAK cells rest 


18.2 




CCD1106 (Keratinocytes) 
TNFalpha + IL-lbeta 


4.6 


LAK cells IL-2


16.6 


Liver cirrhosis 


3.1 


LAK cells IL-2+IL-12 


11.7 


Lupus kidney 


3.0


[LAK cells IL-2+IFN 
gamma 


25.9 


NCI-H292 none 


7.1 


LAK cells IL-2+ IL-18 


18.3 


NCI-H292 IL-4 


7.2 


LAK cells 
PMA/ionomycin 


8.8 


NCI-H292 IL-9


8.8 
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NK Cells IL-2 rest \ 12.2 ;NCI-H292 IL-13 I 5.0 j 

........................ ......... - • ...... . ..J • • - - . . „ . . v 1 


Two Way MLR 3 day , 


1 5.4 jNCI-H292 IFN gamma 


4.0 t 


Two Way MLR 5 day 


11.8


HPAEC none


202 \ 


Two Way MLR 7 day 


HPAEC TNF alpha + IL-1
beta


27.2 


PBMC rest 


7.1


Lung fibroblast none


4.9 


PBMC PWM 

PBMC PHA-L " 

Ramos (B cell) none 

Ramos (B cell) 
ionomycin 


55.5 

25.9 
9.9 

33.4 


Lung fibroblast TNF alpha 
+ IL-1 beta 

Lung fibroblast IL-4 

Lung fibroblast IL-9 

Lung fibroblast IL-13 


1 

4.1 

19.1 
12.5 

1? R 

IZr.O 


B lymphocytes PWM 
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CNSneurodegenerationvl.O Summary: Ag3021 This panel confirms the expression 
of the CG56787-01 gene at low levels in the brain in an independent group of individuals. 
However, no differential expression of this gene was detected between Alzheimer's diseased 
postmortem brains and those of non-demented controls in this experiment. Please see Panel 1.3D
for a discussion of the potential utility of this gene in treatment of central nervous system
disorders.

Panel 1.3D Summary: Ag3021 Expression of the CG56787-01 gene is highest in a
sample derived from ovarian cancer cell line OVCAR-5 (CT = 30). In addition, there is
substantial expression of this gene associated with other ovarian cancer cell lines as well as a
breast cancer cell line. Thus, the expression of this gene could be used to distinguish OVCAR-5 
cells from other samples in the panel. Moreover, therapeutic modulation of the activity of this 
gene or its protein product, through the use of small molecule drugs, protein therapeutics or 
antibodies, might be beneficial for the treatment of ovarian or breast cancer. 

This gene is expressed at low levels in all regions of the CNS examined, including 
amygdala, cerebellum, hippocampus, substantia nigra, cerebral cortex, thalamus and spinal cord. 
This gene encodes a protein with homology to dual-specificity phosphatases. Dual-specificity 
phosphatases comprise a family of MAP kinase-regulating enzymes that are upregulated in brains 
subjected to insults such as ischemia and seizure activity. MAP kinases are known to regulate 
neurotrophic and neurotoxic pathways. Consequently, agents that modulate the activity of 
CG56787-01 may have utility in attenuating the apoptotic and neurodegenerative processes 
following brain insults. 

This gene is also expressed at low levels (CTs = 33-34) in pancreas, thyroid, pituitary 
gland, adult and fetal heart, adult and fetal skeletal muscle, and adipose. Thus, this novel protein 
phosphatase may be a target for small molecule drugs in the treatment of metabolic and endocrine 
diseases, including obesity and diabetes. 

References: 

1. Wiessner C. The dual specificity phosphatase PAC-1 is transcriptionally induced in the 
rat brain following transient forebrain ischemia. Brain Res Mol Brain Res 1995 Feb;28(2):353-6 

PAC-1 mRNA has previously been found only in activated T-cells in vitro and in vivo. 
The gene encodes a dual specificity protein phosphatase that regulates MAP kinase activity. Here, 
I describe that PAC-1 mRNA is induced also in neurons in the rat brain following 30 min of 
forebrain ischemia. At 6, 12 and 24 h after ischemia, PAC-1 mRNA was found most prominently 
in hippocampal cells which are resistant to 30 min of forebrain ischemia, but not in the selectively 
vulnerable CA1 sector. At later time points and in control animals no PAC-1 mRNA could be 
detected in any brain region. The protein-tyrosine/threonine phosphatase PAC-1, therefore, may 
be involved in adaptational responses of hippocampal cells resistant to ischemic injury. 
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2. Boschert U, Muda M, Camps M, Dickinson R, Arkinstall S. Induction of the dual 
specificity phosphatase PAC1 in rat brain following seizure activity. Neuroreport 1997 Sep 
29;8(14):3077-80 

Recurrent seizure activity leads to delayed neuronal death as well as to inflammatory 
responses involving microglia in hippocampal subfields CA1, CA3 and CA4. Since mitogen 
activated protein (MAP) kinases control neuronal apoptosis and trigger generation of 
inflammatory cytokines, their activation state could determine seizure-related brain damage. 
PAC1 is a dual specificity protein phosphatase inactivating MAP kinases which we have found to 
be undetectable in normal brain. Despite this, kainic acid-induced seizure activity lead to rapid 
(approximately 3 h) but transient appearance of PAC1 mRNA in granule cells of the dentate gyrus 
as well as in pyramidal CA1 neurons. This pattern changed with time and after 2-3 days PAC1 
was induced in dying CA1 and CA3 neurons. At this time PAC1 mRNA was also expressed in 
white matter microglia as well as in microglia invading the damaged hippocampus. PAC1 may 
play an important role controlling MAP kinase involvement in both neuronal death and neuro- 
inflammation following excitotoxic damage. 

3. Muda M, Boschert U, Dickinson R, Martinou JC, Martinou I, Camps M, Schlegel W, 
Arkinstall S. MKP-3, a novel cytosolic protein-tyrosine phosphatase that exemplifies a new class 
of mitogen-activated protein kinase phosphatase. J Biol Chem 1996 Feb 23;271(8):4319-26 

MKP-1 (also known as CL100, 3CH134, Erp, and hVH-1) exemplifies a class of dual- 
specificity phosphatase able to reverse the activation of mitogen-activated protein (MAP) kinase 
family members by dephosphorylating critical tyrosine and threonine residues. We now report the 
cloning of MKP-3, a novel protein phosphatase that also suppresses MAP kinase activation state. 
The deduced amino acid sequence of MKP-3 is 36% identical to MKP-1 and contains the 
characteristic extended active-site sequence motif VXVHCXXGXSRSXTXXXAYLM (where X 
is any amino acid) as well as two N-terminal CH2 domains displaying homology to the cell cycle 
regulator Cdc25 phosphatase. When expressed in COS-7 cells, MKP-3 blocks both the 
phosphorylation and enzymatic activation of ERK2 by mitogens. Northern analysis reveals a 
single mRNA species of 2.7 kilobases with an expression pattern distinct from other dual- 
specificity phosphatases. MKP-3 is expressed in lung, heart, brain, and kidney, but not 
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significantly in skeletal muscle or testis. In situ hybridization studies of MKP-3 in brain reveal 
enrichment within the CA1, CA3, and CA4 layers of the hippocampus. 
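The extended active-site motif quoted in the abstract above, VXVHCXXGXSRSXTXXXAYLM with X standing for any amino acid, translates directly into a pattern search. The sketch below shows one way to scan a protein sequence for it; the helper names are hypothetical, and the test sequence is a made-up string built to contain the motif, not a sequence disclosed in this document.

```python
import re


def motif_to_regex(motif):
    """Compile a motif in which 'X' means any residue into a regex."""
    pattern = "".join("." if ch == "X" else re.escape(ch) for ch in motif)
    return re.compile(pattern)


def find_motif(sequence, motif):
    """Return (1-based start position, matched residues) for each hit."""
    regex = motif_to_regex(motif)
    return [(m.start() + 1, m.group()) for m in regex.finditer(sequence)]


ACTIVE_SITE_MOTIF = "VXVHCXXGXSRSXTXXXAYLM"

# Hypothetical sequence for illustration only (not from the specification).
example = "MK" + "VAVHCQAGISRSATICLAYLM" + "QK"
hits = find_motif(example, ACTIVE_SITE_MOTIF)
print(hits)

# Replacing the catalytic cysteine breaks the match.
mutant = example.replace("VHC", "VHS")
print(find_motif(mutant, ACTIVE_SITE_MOTIF))
```

A literal pattern of this kind is strict; a profile-based search would tolerate conservative substitutions, at the cost of more false positives.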

Panel 4D Summary: Ag3021 The CG56787-01 gene is expressed at low to moderate 
levels in all tissues examined except IBD colitis and Crohn's. This gene encodes a putative dual 
specificity phosphatase that may be important in maintaining normal cellular homeostasis in a 
wide range of tissues. Therapies designed with the protein encoded by this transcript could be
important in the treatment of diseases, such as IBD and Crohn's disease, that show reduced
expression of this transcript.
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OTHER EMBODIMENTS 

Although particular embodiments have been disclosed herein in detail, this has been done 
by way of example for purposes of illustration only, and is not intended to be limiting with respect 
to the scope of the appended claims, which follow. In particular, it is contemplated by the 
inventors that various substitutions, alterations, and modifications may be made to the invention 
without departing from the spirit and scope of the invention as defined by the claims. The choice 
of nucleic acid starting material, clone of interest, or library type is believed to be a matter of 
routine for a person of ordinary skill in the art with knowledge of the embodiments described 
herein. Other aspects, advantages, and modifications are considered to be within the scope of the
following claims.
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